(54) Title: DIRECTED EVOLUTION METHOD

(57) Abstract: We describe a method of selecting an enzyme having replicase activity, the
method comprising the steps of: (a) providing a pool of nucleic acids comprising members each
encoding a replicase or a variant of the replicase; (b) subdividing the pool of nucleic acids
into compartments, such that each compartment comprises a nucleic acid member of the pool
together with the replicase or variant encoded by the nucleic acid member; (c) allowing
nucleic acid replication to occur; and (d) detecting amplification of the nucleic acid member
by the replicase. Methods for selecting agents capable of modulating replicase activity, and
for selecting interacting polypeptides are also disclosed.

DIRECTED EVOLUTION METHOD

Field of the Invention

The present invention relates to methods for use in in vitro evolution of molecular
libraries. In particular, the present invention relates to methods of selecting nucleic acids
encoding gene products in which the nucleic acid and the activity of the encoded gene product
are linked by compartmentalisation.

Background to the Invention 

Evolution requires the generation of genetic diversity (diversity in nucleic acid)
followed by the selection of those nucleic acids which encode beneficial characteristics.
Because the activity of the nucleic acids and their encoded gene product are physically linked
in biological organisms (the nucleic acids encoding the molecular blueprint of the cells in
which they are confined), alterations in the genotype resulting in an adaptive change(s) of
phenotype produce benefits for the organism resulting in increased survival and offspring.
Multiple rounds of mutation and selection can thus result in the progressive enrichment of
organisms (and the encoding genotype) with increasing adaptation to a given selection
condition. Systems for rapid evolution of nucleic acids or proteins in vitro must mimic this
process at the molecular level in that the nucleic acid and the activity of the encoded gene
product must be linked and the activity of the gene product must be selectable.

In vitro selection technologies are a rapidly expanding field and often prove more
powerful than rational design to obtain biopolymers with desired properties. In the past
decade selection experiments, using e.g. phage display or SELEX technologies, have yielded
many novel polynucleotide and polypeptide ligands. Selection for catalysis has proved harder.
Strategies have included binding of transition state analogues, covalent linkage to suicide
inhibitors, proximity coupling and covalent product linkage. Although these approaches focus
only on a particular part of the enzymatic cycle, there have been some successes. Ultimately
however it would be desirable to select directly for catalytic turnover. Indeed, simple
screening for catalytic turnover of fairly small mutant libraries has been rather more
successful than the various selection approaches and has yielded some catalysts with greatly
improved catalytic rates.

While polymerases are a prerequisite for technologies that define molecular biology,
i.e. site-directed mutagenesis, cDNA cloning and in particular Sanger sequencing and PCR,
they often suffer from serious shortcomings due to the fact that they are made to perform tasks
for which nature has not optimized them. Few attempts appear to have been made to improve
the properties of polymerases available from nature and to tailor them for specific applications
by protein engineering. Technical advances have been largely peripheral, and include the use
of polymerases from a wider range of organisms, buffer and additive systems as well as
enzyme blends.

Attempts to improve the properties of polymerases have traditionally relied on protein
engineering. For example, variants of Taq polymerase (for example, Stoffel fragment and
Klentaq) have been generated by full or partial deletion of its 5'-3' exonuclease domain and
show improved thermostability and fidelity although at the cost of reduced processivity
(Barnes 1992, Gene 112, 29-35; Lawyer et al., 1993, PCR Methods and Applications 2, 275).
In addition, the availability of high-resolution structures for proteins has allowed the rational
design of mutants with improved properties (for example, Taq mutants with improved
properties of dideoxynucleotide incorporation for cycle sequencing, Li et al., 1999, Proc. Natl.
Acad. Sci. USA 96, 9491). In vivo genetic approaches have also been used for protein design,
for example by complementation of a polA- strain to select for active polymerases from
repertoires of mutant polymerases (Suzuki et al., 1996, Proc. Natl. Acad. Sci. USA 93, 9670).
However, the genetic complementation approach is limited in the properties that can be
selected for.

Recent advances in molecular biology have allowed some molecules to be co-selected
in vitro according to their properties along with the nucleic acids that encode them. The
selected nucleic acids can subsequently be cloned for further analysis or use, or subjected to
additional rounds of mutation and selection. Common to these methods is the establishment of
large libraries of nucleic acids. Molecules having the desired characteristics (activity) can be
isolated through selection regimes that select for the desired activity of the encoded gene
product, such as a desired biochemical or biological activity, for example binding activity.

WO99/02671 describes a method for isolating one or more genetic elements encoding
a gene product having a desired activity. Genetic elements are first compartmentalised into
microcapsules, and then transcribed and/or translated to produce their respective gene
products (RNA or protein) within the microcapsules. Alternatively, the genetic elements are
contained within a host cell in which transcription and/or translation (expression) of the gene
product takes place and the host cells are first compartmentalised into microcapsules. Genetic
elements which produce gene product having desired activity are subsequently sorted. The
method described in WO99/02671 relies on the gene product catalytically modifying the
microcapsule or the genetic element (or both), so that enrichment of the modified entity or
entities enables selection of the desired activity.

Summary of the Invention

According to a first aspect of the present invention, we provide a method of selecting a
nucleic acid-processing (NAP) enzyme, the method comprising the steps of: (a) providing a
pool of nucleic acids comprising members encoding a NAP enzyme or a variant of the NAP
enzyme; (b) subdividing the pool of nucleic acids into compartments, such that each
compartment comprises a nucleic acid member of the pool together with the NAP enzyme or
variant encoded by the nucleic acid member; (c) allowing nucleic acid processing to occur;
and (d) detecting processing of the nucleic acid member by the NAP enzyme.

There is provided, according to a second aspect of the present invention, a method of
selecting an agent capable of modifying the activity of a NAP enzyme, the method comprising
the steps of: (a) providing a NAP enzyme; (b) providing a pool of nucleic acids comprising
members encoding one or more candidate agents; (c) subdividing the pool of nucleic acids
into compartments, such that each compartment comprises a nucleic acid member of the pool,
the agent encoded by the nucleic acid member, and the NAP enzyme; and (d) detecting
processing of the nucleic acid member by the NAP enzyme.

Preferably, the agent is a promoter of NAP enzyme activity. The agent may be an
enzyme, preferably a kinase or a phosphorylase, which is capable of acting on the NAP
enzyme to modify its activity. The agent may be a chaperone involved in the folding or
assembly of the NAP enzyme or required for the maintenance of replicase function (e.g.
telomerase, HSP 90). Alternatively, the agent may be a polypeptide or polynucleotide
involved in a metabolic pathway, the pathway having as an end product a substrate which is
involved in a replication reaction. The agent may moreover be any enzyme which is capable
of catalysing a reaction that modifies an inhibiting agent (natural or unnatural) of the NAP
enzyme in such a way as to reduce or abolish its inhibiting activity. Finally the agent may
promote NAP activity in a non-catalytic way, e.g. by association with the NAP enzyme or its
substrate etc. (e.g. processivity factors in the case of DNA polymerases, e.g. T7 DNA
polymerase & thioredoxin).

We provide, according to a third aspect of the present invention, a method of selecting
a pair of polypeptides capable of stable interaction, the method comprising: (a) providing a
first nucleic acid and a second nucleic acid, the first nucleic acid encoding a first fusion
protein comprising a first subdomain of a NAP enzyme fused to a first polypeptide, the
second nucleic acid encoding a second fusion protein comprising a second subdomain of a
NAP enzyme fused to a second polypeptide; in which stable interaction of the first and second
NAP enzyme subdomains generates NAP enzyme activity, and in which at least one of the
first and second nucleic acids is provided in the form of a pool of nucleic acids encoding
variants of the respective first and/or second polypeptide(s); (b) subdividing the pool or pools
of nucleic acids into compartments, such that each compartment comprises a first nucleic
acid and a second nucleic acid together with respective fusion proteins encoded by the first
and second nucleic acids; (c) allowing the first polypeptide to bind to the second polypeptide,
such that binding of the first and second polypeptides leads to stable interaction of the NAP
enzyme subdomains to generate NAP enzyme activity; and (d) detecting processing of at least
one of the first and second nucleic acids by the NAP enzyme.

Moreover, the NAP enzyme domains referred to in (a) above may be replaced with
domains of a polypeptide capable of modifying the activity of NAP enzymes, as discussed in
the second aspect of the present invention, and NAP enzyme activity used to select such
modifying polypeptides having desired properties.

Preferably, each of the first and second nucleic acids is provided from a pool of
nucleic acids.

Preferably, the first and second nucleic acids are linked either covalently (e.g. as part
of the same template molecule) or non-covalently (e.g. by tethering onto beads etc.).

NAP enzymes may for example be polypeptide or ribonucleic acid enzyme molecules.
In a highly preferred embodiment, the NAP enzyme according to the invention is a replicase
enzyme, i.e. an enzyme which is capable of amplifying nucleic acid from a template, such as
for example a polymerase enzyme (or ligase). The invention is described herein below with
specific reference to replicases; however, it will be understood by those skilled in the art that
the invention is equally applicable to other NAP enzymes, such as telomerases and helicases,
as further set out below, which process nucleic acids in ways not limited to amplification but
which are nevertheless selectable by detecting nucleic acid amplification, i.e. which promote
replication indirectly.

In a preferred embodiment of the invention, amplification of the nucleic acid results
from more than one round of nucleic acid replication. Preferably, the amplification of the
nucleic acid is an exponential amplification.

The amplification reaction is preferably selected from the following: a polymerase
chain reaction (PCR), a reverse transcriptase-polymerase chain reaction (RT-PCR), a nested
PCR, a ligase chain reaction (LCR), a transcription based amplification system (TAS), a
self-sustaining sequence replication (3SR), NASBA, a transcription-mediated amplification
reaction (TMA), and a strand-displacement amplification (SDA).

In a highly preferred embodiment, the post-amplification copy number of the nucleic
acid member is substantially proportional to the activity of the replicase, the activity of a
requisite agent, or the binding affinity of the first and second polypeptides.

Nucleic acid replication may be detected by assaying the copy number of the nucleic
acid member. Alternatively, or in addition, nucleic acid replication may be detected by
determining the activity of a polypeptide encoded by the nucleic acid member.

In a highly preferred embodiment, the conditions in the compartment are adjusted to
select for a replicase or agent active under such conditions, or a pair of polypeptides capable
of stable interaction under such conditions.

The replicase preferably has polymerase, reverse transcriptase or ligase activity.

The polypeptide may be provided from the nucleic acid by in vitro transcription and
translation. Alternatively, the polypeptide may be provided from the nucleic acid in vivo in an
expression host.

In a preferred embodiment, the compartments consist of the encapsulated aqueous
component of a water-in-oil emulsion. The water-in-oil emulsion is preferably produced by
emulsifying an aqueous phase with an oil phase in the presence of a surfactant comprising
4.5% v/v Span 80, 0.4% v/v Tween 80 and 0.1% v/v Triton X100, or a surfactant comprising
Span 80, Tween 80 and Triton X100 in substantially the same proportions. Preferably, the
water:oil phase ratio is 1:2, which leads to adequate droplet size. Such emulsions have a
higher thermal stability than more oil-rich emulsions.
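
The quoted composition can be scaled to a working volume with a short calculation. The
following Python sketch is illustrative only. It assumes that the v/v percentages refer to
the oil phase and that the balance of the oil phase is a carrier oil, which this passage does
not name; the function and field names are hypothetical.

```python
from dataclasses import dataclass

# Surfactant fractions (v/v) as quoted in the text.
SURFACTANT_FRACTIONS = {
    "Span 80": 0.045,
    "Tween 80": 0.004,
    "Triton X100": 0.001,
}


@dataclass(frozen=True)
class EmulsionRecipe:
    aqueous_ul: float
    oil_phase_ul: float
    surfactants_ul: dict
    carrier_oil_ul: float


def emulsion_recipe(aqueous_ul: float, oil_per_water: float = 2.0) -> EmulsionRecipe:
    """Scale the quoted composition to a given aqueous volume.

    ASSUMPTION: the v/v percentages apply to the oil phase, and the rest of
    the oil phase is an unnamed carrier oil. Neither point is stated here.
    """
    if aqueous_ul <= 0 or oil_per_water <= 0:
        raise ValueError("volumes and ratios must be positive")
    # A water:oil phase ratio of 1:2 means two volumes of oil per volume of water.
    oil_phase_ul = aqueous_ul * oil_per_water
    surfactants_ul = {
        name: oil_phase_ul * fraction
        for name, fraction in SURFACTANT_FRACTIONS.items()
    }
    carrier_oil_ul = oil_phase_ul - sum(surfactants_ul.values())
    return EmulsionRecipe(aqueous_ul, oil_phase_ul, surfactants_ul, carrier_oil_ul)


if __name__ == "__main__":
    recipe = emulsion_recipe(200.0)
    for name, volume in recipe.surfactants_ul.items():
        print(f"{name}: {volume:.2f} ul")
    print(f"carrier oil: {recipe.carrier_oil_ul:.2f} ul")
```

Passing a different value for `oil_per_water` gives the volumes for other phase ratios.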

As a fourth aspect of the present invention, there is provided a replicase enzyme
identified by a method according to any preceding claim. Preferably, the replicase enzyme has
a greater thermostability than a corresponding unselected enzyme. More preferably, the
replicase enzyme is a Taq polymerase having more than 10 times increased half-life at 97.5°C
when compared to wild type Taq polymerase.

The replicase enzyme may have a greater tolerance to heparin than a corresponding
unselected enzyme. Preferably, the replicase enzyme is a Taq polymerase active at a
concentration of 0.083 units/µl or more of heparin.

The replicase enzyme may be capable of extending a primer having a 3' mismatch.
Preferably, the 3' mismatch is a 3' purine-purine mismatch or a 3' pyrimidine-pyrimidine
mismatch. More preferably, the 3' mismatch is an A-G mismatch or the 3' mismatch is a C-C
mismatch.

We provide, according to a fifth aspect of the present invention, a Taq polymerase
mutant comprising the mutations (amino acid substitutions): F73S, R205K, K219E, M236T,
E434D and A608V.

The present invention, in a sixth aspect, provides a Taq polymerase mutant comprising
the mutations (amino acid substitutions): K225E, E388V, K540R, D578G, N583S and
M747R.

The present invention, in a seventh aspect, provides a Taq polymerase mutant comprising the
mutations (amino acid substitutions): G84A, D144G, K314R, E520G, A608V, E742G.

The present invention, in an eighth aspect, provides a Taq polymerase mutant comprising the
mutations (amino acid substitutions): D58G, R74P, A109T, L245R, R343G, G370D, E520G,
N583S, E694K, A743P.

In a ninth aspect of the present invention, there is provided a water-in-oil emulsion
obtainable by emulsifying an aqueous phase with an oil phase in the presence of a surfactant
comprising 4.5% v/v Span 80, 0.4% v/v Tween 80 and 0.1% v/v Triton X100, or a surfactant
comprising Span 80, Tween 80 and Triton X100 in substantially the same proportions.
Preferably, the water:oil phase ratio is 1:2. This ratio appears to permit diffusion of dNTPs
(and presumably other small molecules) between compartments at higher temperatures, which
is beneficial for some applications but not for others. Diffusion can be controlled by
increasing the water:oil phase ratio to 1:4.

Brief Description of the Drawings

Figure 1A is a diagram showing an embodiment of a method according to the present
invention as applied to selection of a self-evolving polymerase, in which gene copy number is
linked to enzymatic turnover.

Figure 1B is a diagram showing a general scheme of compartmentalised
self-replication (CSR): 1) A repertoire of diversified polymerase genes is cloned and expressed in
E. coli. Spheres represent active polymerase molecules. 2) Bacterial cells containing the
polymerase and encoding gene are suspended in reaction buffer containing flanking primers
and nucleotide triphosphates (dNTPs) and segregated into aqueous compartments. 3) The
polymerase enzyme and encoding gene are released from the cell, allowing self-replication to
proceed. Poorly active polymerases (white hexagon) fail to replicate their encoding gene. 4)
The "offspring" polymerase genes are released, rediversified and recloned for another cycle of
CSR.

Figure 2 is a diagram showing aqueous compartments of the heat-stable emulsion
containing E. coli cells expressing green fluorescent protein (GFP) prior to (A, B), and after
thermocycling (C), as imaged by light microscopy. (A, B) represent the same frame. (A) is
imaged at 535 nm for GFP fluorescence and (B) in visible light to visualize bacterial cells
within compartments. Smudging of the fluorescent bacteria in (A) is due to Brownian motion
during exposure. Average compartment dimensions as determined by laser diffraction are
given below.

Figure 3A is a diagram showing crossover between emulsion compartments. Two
standard PCR reactions, differing in template size (PCR1 (0.9 kb), PCR2 (0.3 kb)) and
presence of Taq (PCR1: + Taq, PCR2: no enzyme), are amplified individually or combined.
When combined in solution, both templates are amplified. When emulsified separately, prior
to mixing, only PCR1 is amplified. M: φX174 HaeIII marker

Figure 3B is a diagram showing crossover between emulsion compartments. Bacterial
cells expressing wild-type Taq polymerase (2.7 kb) or the Taq polymerase Stoffel fragment
(poorly active under the buffer conditions) (1.8 kb) are mixed 1:1 prior to emulsification. In
solution, the shorter Stoffel fragment is amplified preferentially. In emulsion, there is
predominantly amplification of the wt Taq gene and only weak amplification of the Stoffel
fragment (arrow). M: λHindIII marker

WO 02/22869                PCT/GB01/04108

Figure 4 is a diagram showing details of an embodiment of a method according to the
present invention as applied to selection of a self-evolving polymerase.

Figure 5 is a diagram showing details of an embodiment of a method according to the
present invention to select for incorporation of novel or unusual substrates.

Figure 6 is a diagram showing selection of RNA having (intermolecular) catalytic
activity using the methods of our invention.

Figure 7 is a diagram showing a model of a Taq-DNA complex.

Figure 8: A: General scheme of a cooperative CSR reaction.
Nucleoside diphosphate kinase (ndk) is expressed from a plasmid and converts
deoxynucleoside diphosphates, which are not substrates for Taq polymerase, into
deoxynucleoside triphosphates, which are. As soon as ndk has produced sufficient
amounts of substrate, Taq can replicate the ndk gene.

B: Bacterial cells expressing wild-type ndk (0.8 kb) or an inactive truncated
fragment (0.5 kb) are mixed 1:1 prior to emulsification. In solution, the shorter truncated
fragment is amplified preferentially. In emulsion, there is predominantly amplification of
the wt ndk gene and only weak amplification of the truncated fragment (arrow),
indicating that in emulsion only active ndk genes producing substrate are amplified. M:
HaeIII φX174 marker

Detailed Description of the Invention

The practice of the present invention will employ, unless otherwise indicated,
conventional techniques of chemistry, molecular biology, microbiology, recombinant DNA
and immunology, which are within the capabilities of a person of ordinary skill in the art.
Such techniques are explained in the literature. See, e.g., J. Sambrook, E. F. Fritsch, and T.
Maniatis, 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Books 1-3, Cold
Spring Harbor Laboratory Press; B. Roe, J. Crabtree, and A. Kahn, 1996, DNA Isolation and
Sequencing: Essential Techniques, John Wiley & Sons; J. M. Polak and James O'D. McGee,
1990, In Situ Hybridization: Principles and Practice, Oxford University Press; M. J. Gait
(Editor), 1984, Oligonucleotide Synthesis: A Practical Approach, IRL Press; and D. M. J.
Lilley and J. E. Dahlberg, 1992, Methods of Enzymology: DNA Structure Part A: Synthesis
and Physical Analysis of DNA, Methods in Enzymology, Academic Press. Each of these
general texts is herein incorporated by reference.

Compartmentalised Self Replication

Our invention describes a novel selection technology, which we call CSR
(compartmentalised self-replication). It has the potential to be expanded into a generic
selection system for catalysis as well as macromolecular interactions.

In its simplest form CSR involves the segregation of genes coding for and directing
the production of DNA polymerases within discrete, spatially separated, aqueous
compartments of a novel heat-stable water-in-oil emulsion. Provided with nucleotide
triphosphates and appropriate flanking primers, polymerases replicate only their own genes.
Consequently, only genes encoding active polymerases are replicated, while inactive variants
that cannot copy their genes disappear from the gene pool. By analogy to biological systems,
among differentially adapted variants, the most active (the fittest) produce the most
"offspring", hence directly correlating post-selection copy number with enzymatic turn-over.






CSR is not limited to polymerases but can be applied to a wide variety of enzymatic
transformations, built around the "replicase engine". For example, an enzyme "feeding" a
polymerase which in turn replicates its gene may be selected. More complicated coupled
cooperative reaction schemes can be envisioned in which several enzymes either produce
replicase substrates or consume replicase inhibitors.

Polymerases occupy a central role in genome maintenance, transmission and
expression of genetic information. Polymerases are also at the heart of modern biology,
enabling core technologies such as mutagenesis, cDNA libraries, sequencing and the
polymerase chain reaction (PCR). However, commonly used polymerases frequently suffer
from serious shortcomings as they are used to perform tasks for which nature had not
optimized them. Indeed, most advances have been peripheral, including the use of
polymerases from different organisms, improved buffer and additive systems as well as
enzyme blends. CSR is a novel selection system ideally suited for the isolation of "designer"
polymerases for specific applications. Many features of polymerase function are open to
"improvement" (e.g. processivity, substrate selection etc.). Furthermore, CSR is a tool to
study polymerase function, e.g. to probe immutable regions, study components of the
replisome etc. Moreover, CSR may be used for shotgun functional cloning of polymerases,
straight from diverse, uncultured microbial populations.

CSR represents a novel principle of repertoire selection of polypeptides. Previous
approaches have featured various "display" methods in which phenotype and genotype
(polypeptide and encoding gene) are linked as part of a "genetic package" containing the
encoding gene and displaying the polypeptide on the "outside". Selection occurs via a step of
affinity purification after which surviving clones are grown (amplified) in cells for further
rounds of selection (with resulting biases in growth distorting selections). Further distortions
result from differences in the display efficiencies between different polypeptides.

In another set of methods both polypeptide and encoding gene(s) are "packaged" within a cell.
Selection occurs in vivo through the polypeptide modifying the cell in such a way that it
acquires a novel phenotype, e.g. growth in presence of an antibiotic. As the selection pressure
is applied on whole cells, such approaches tend to be prone to the generation of false
positives. Furthermore, in vivo complementation strategies are limited in that selection
conditions, and hence selectable phenotypes, cannot be freely chosen and are further
constrained by limits of host viability.

In CSR, there is no direct physical linkage (covalent or non-covalent) between polypeptide
and encoding gene. More copies of successful genes are "grown" directly and in vitro as part
of the selection process.

CSR is applicable to a broad spectrum of DNA and RNA polymerases, indeed to all
polypeptides (or polynucleotides) involved in replication or gene expression. CSR can also be
applied to DNA and RNA ligases assembling their genes from oligonucleotide fragments.

CSR is the only selection system in which the turn-over rate of an enzyme is directly
linked to the post-selection copy-number of its encoding gene.

There is great interest in polynucleotide polymers with altered bases, altered sugars or
even backbone chemistries. However, solid-phase synthesis can usually only provide
relatively short polymers and naturally occurring polymerases unsurprisingly incorporate most
analogues poorly. CSR is ideally suited for the selection of polymerases more tolerant of
unnatural substrates in order to prepare polynucleotide polymers with novel properties for
chemistry, biology and nanotechnology (e.g. DNA wires).

Finally, the heat-stable emulsion developed for CSR has applications on its own. With
> 10^ microcompartments/ml, emulsion PCR (ePCR) offers the possibility of parallel PCR
multiplexing on an unprecedented scale with potential applications from gene linkage analysis
to genomic repertoire construction directly from single cells. It may also have applications for
large-scale diagnostic PCR applications like "Digital PCR" (Vogelstein & Kinzler (1999),
PNAS, 96, 9236-9241). Compartmentalizing individual reactions can also even out
competition among different gene segments that are amplified in either multiplex or random
primed PCR and leads to a less biased distribution of amplification products. ePCR may thus
provide an alternative to whole genome DOP-PCR (and related methodologies) or indeed be
used to make DOP-PCR (and related methodologies) more effective.

The selection system according to our invention is based on self-replication in a
compartmentalised system. Our invention relies on the fact that active replicases are able to
replicate nucleic acids (in particular their coding sequences), while inactive replicases cannot.
Thus, in the methods of our invention, we provide a compartmentalised system where a
replicase in a compartment is substantially unable to act on any template other than the
templates within that compartment; in particular, it cannot act to replicate a template within
any other compartment. In highly preferred embodiments, the template nucleic acid within the
compartment encodes the replicase. Thus, the replicase cannot replicate anything other than its
coding sequence; the replicase is therefore "linked" to its coding sequence. As a result, in
highly preferred embodiments of our invention, the final concentration of the coding sequence
(i.e. copy number) is dependent on the activity of the enzyme encoded by it.
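
The dependence of copy number on enzyme activity can be illustrated with a toy model. The
Python sketch below is not part of the method described here. It assumes that each variant
amplifies only its own gene, and that its activity can be summarised as a hypothetical
per-cycle replication efficiency between 0 and 1, so that copy number grows as
(1 + efficiency)^cycles. All names are illustrative.

```python
def csr_round(pool, efficiencies, cycles):
    """Amplify each variant in isolation and return the resulting copy numbers.

    pool:         variant name -> starting copy number
    efficiencies: variant name -> per-cycle replication efficiency in [0, 1]
                  (ASSUMPTION: a stand-in for replicase activity)
    cycles:       number of amplification cycles
    """
    amplified = {}
    for name, copies in pool.items():
        efficiency = efficiencies[name]
        if not 0.0 <= efficiency <= 1.0:
            raise ValueError(f"efficiency for {name!r} must lie in [0, 1]")
        # Compartmentalisation: a variant's gene is copied only by its own enzyme.
        amplified[name] = copies * (1.0 + efficiency) ** cycles
    return amplified


def fractions(pool):
    """Convert copy numbers into each variant's share of the pool."""
    total = sum(pool.values())
    return {name: copies / total for name, copies in pool.items()}


if __name__ == "__main__":
    start = {"active": 1.0, "weak": 1.0, "inactive": 1.0}
    activity = {"active": 1.0, "weak": 0.5, "inactive": 0.0}
    after = csr_round(start, activity, cycles=10)
    for name, share in fractions(after).items():
        print(f"{name}: {share:.3f} of the pool")
```

In this model the inactive variant keeps its starting copy number while the active variants
are amplified, so its share of the pool falls with every round.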

Our selection system as applied to selection of replicases has the advantage that it
links catalytic turnover (kcat/Km) to the post-selection copy-number of the gene encoding the
catalyst. Thus, compartmentalisation offers the possibility of linking genotype and phenotype
of a replicase enzyme, as described in further detail below, by a coupled enzymatic reaction
involving the replication of the gene or genes of the enzyme(s) as one of its steps.

The methods of our invention preferably make use of nucleic acid libraries, the nature
and construction of which will be explained in greater detail below. The nucleic acid library
comprises a pool of different nucleic acids, members of which encode variants of a particular
entity (the entity to be selected). Thus, for example, as used to select for replicases, the
methods of our invention employ a nucleic acid library or pool having members which
encode the replicase or variants of the replicase. Each of the entities encoded by the various
members of the library will have different properties, e.g., varying tolerance to heat or to the
presence of inhibitory small molecules, or tolerance for base pair mismatches (as explained in
further detail below). The population of nucleic acid variants therefore provides a starting
material for selection, and is in many ways analogous to variation in a natural population of
organisms caused by mutation.

According to our invention, the different members of the nucleic acid library or pool
are sorted or compartmentalised into many compartments or microcapsules. In preferred
embodiments, each compartment contains substantially one nucleic acid member of the pool
(in one or several copies). In addition, the compartment also comprises the polypeptide or
polynucleotide (in one or preferably several copies) encoded by that nucleic acid member
(whether it is a replicase, an agent, a polypeptide, etc. as discussed below). The nature of these
compartments is such that minimal or substantially no interchange of macromolecules (such
as nucleic acids and polypeptides) occurs between different compartments. As explained in
further detail below, highly preferred embodiments of our invention make use of aqueous
compartments within water-in-oil emulsions. As explained above, any replicase activity
present in the compartment (whether exhibited by the replicase, modified by an agent, or
exhibited by the polypeptide acting in conjunction with another polypeptide) can only act on
the template within the compartment.
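
How closely a preparation approaches "substantially one" member per compartment can be
estimated with a simple occupancy calculation. The Python sketch below assumes that library
members are distributed among compartments at random, following a Poisson distribution. That
assumption, the parameter `mean_per_compartment` and the function names are illustrative and
are not taken from the text.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a compartment holds exactly k members."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def single_member_fraction(mean_per_compartment: float) -> float:
    """Among occupied compartments, the fraction holding exactly one member.

    ASSUMPTION: members are loaded independently and at random (Poisson).
    """
    if mean_per_compartment <= 0:
        raise ValueError("mean_per_compartment must be positive")
    occupied = -math.expm1(-mean_per_compartment)  # 1 - exp(-mean)
    return poisson_pmf(1, mean_per_compartment) / occupied


if __name__ == "__main__":
    for mean in (0.1, 0.5, 1.0, 2.0):
        share = single_member_fraction(mean)
        print(f"mean {mean}: {share:.3f} of occupied compartments hold one member")
```

Under this assumption, lowering the mean loading raises the share of occupied compartments
that hold a single member, at the cost of leaving more compartments empty.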

The conditions within the compartments may be varied in order to select for
polypeptides active under these conditions. For example, where replicases are selected, the
compartments may have an increased temperature to select for replicases with higher thermal
stability. Furthermore, using the selection methods described here on fusion proteins
comprising a thermostable replicase and a protein of interest will allow the selection of
thermally stable proteins.

A method for the incorporation of thermal stability into otherwise labile proteins of
commercial importance is desirable with regard to their large-scale production and
distribution. A reporter system has been described to improve protein folding by expressing
proteins as fusions with green fluorescent protein (GFP) (Waldo et al. (1999), Nat Biotechnol
17: 691-695). The function of the latter is related to the productive folding of the fused protein
influencing folding and/or functionality of the GFP, enabling the directed evolution of
variants with improved folding and expression. According to this aspect of our invention,
proteins are fused to a thermostable replicase (or an agent promoting replicase activity) and
active fusions are selected in emulsion, as a method for evolving proteins with increased
thermostability and/or solubility. Unstable variants of the fusion partner are expected to
aggregate and precipitate prior to or during thermal cycling, thus compromising replicase
activity within respective compartments. Viable fusions will allow for self-amplification in
emulsion, with the turn-over rate being linked to the stability of the fusion partner.

In a related approach, novel or increased chaperonin activity may be evolved by
coexpression of a library of chaperones together with a polymerase-polypeptide fusion
protein, in which the protein moiety misfolds (under the selection conditions). Replication of
the gene(s) encoding the chaperonin can only proceed after chaperonin activity has rescued
polymerase activity in the polymerase-polypeptide fusion protein.

Thermostability of an enzyme may be measured by conventional means as known in
the art. For example, the catalytic activity of the native enzyme may be assayed at a certain
temperature as a benchmark. Enzyme assays are well known in the art, and standard assays
have been established over the years. For example, incorporation of nucleotides by a
polymerase is measured by, for example, use of radiolabeled dNTPs such as dATP and filter
binding assays as known in the art. The enzyme whose thermostability is to be assayed is
preincubated at an elevated temperature and then its retained activity (for example,
polymerase activity in the case of polymerases) is measured at a lower, optimum temperature
and compared to the benchmark. In the case of Taq polymerase, the elevated temperature is
97.5°C; the optimum temperature is 72°C. Thermostability may be expressed in the form of
half-life at the elevated temperature (i.e. time of incubation at higher temperature over which
polymerase loses 50% of its activity). For example, the thermostable replicases, fusion
proteins or agents selected by our invention may have a half-life that is 2X, 3X, 4X, 5X, 6X,
7X, 8X, 9X, 10X or more than that of the native enzyme. Most preferably, the thermostable
replicases etc have a half-life that is 11X or more when compared this way. Preferably,
selected polymerases are preincubated at 95°C or more, 97.5°C or more, 100°C or more,
105°C or more, or 110°C or more. Thus, in a highly preferred embodiment of our invention,
we provide polymerases with increased thermostability which display a half-life at 97.5°C
that is 11X or more than that of the corresponding wild type (native) enzyme.
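
The half-life comparison described above can be sketched numerically. The example below
assumes simple first-order (exponential) loss of activity during preincubation, which the text
does not state; the function names and the incubation times and retained fractions are
hypothetical values chosen only to illustrate the arithmetic.

```python
import math


def half_life(minutes_incubated: float, fraction_retained: float) -> float:
    """Half-life from a single preincubation measurement.

    Assumes first-order inactivation, A(t) = A0 * 0.5 ** (t / t_half),
    which is a modelling assumption rather than part of the assay itself.
    """
    if minutes_incubated <= 0:
        raise ValueError("minutes_incubated must be positive")
    if not 0.0 < fraction_retained < 1.0:
        raise ValueError("fraction_retained must be strictly between 0 and 1")
    return minutes_incubated * math.log(0.5) / math.log(fraction_retained)


def fold_improvement(variant_half_life: float, native_half_life: float) -> float:
    """Ratio of the variant half-life to the native half-life."""
    return variant_half_life / native_half_life


# Hypothetical measurements at the elevated temperature.
native = half_life(10.0, 0.25)   # 25% activity left after 10 min
variant = half_life(55.0, 0.50)  # 50% activity left after 55 min
print(f"native: {native:.1f} min, variant: {variant:.1f} min, "
      f"fold: {fold_improvement(variant, native):.1f}X")
```

With these made-up inputs the variant would fall in the "11X or more" category mentioned above.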

Resistance to an inhibitory agent, such as heparin in the case of polymerases, may also
be assayed and measured as above. Resistance to inhibition may be expressed in terms of the
concentration of the inhibitory factor. For example, in preferred embodiments of the
invention, we provide heparin resistant polymerases that are active in up to a concentration of
heparin between 0.083 units/µl and 0.33 units/µl. For comparison, our assays indicate that the
concentration of heparin which inhibits native (wild-type) Taq polymerase is in the region of
0.0005 to 0.0026 units/µl.

Resistance is conveniently expressed in terms of the inhibitor concentration which is
found to inhibit the activity of the selected replicase, fusion protein or agent, compared to the
concentration which is found to inhibit the native enzyme. Thus, the resistant replicases,
fusion proteins, or agents selected by our invention may have 10X, 20X, 30X, 40X, 50X, 60X,
70X, 80X, 90X, 100X, 110X, 120X, 130X, 140X, 150X, 160X, 170X, 180X, 190X, 200X, or
more resistance compared to the native enzyme. Most preferably, the resistant replicases etc
have 130X or more fold increased resistance when compared this way. The selected replicases
etc preferably have 50% or more, 60% or more, 70% or more, 80% or more, 90% or more, or
even 100% activity at the concentration of the inhibitory factor. Furthermore, the
compartments may contain amounts of an inhibitory agent such as heparin to select for
replicases having activity under such conditions.
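
The fold-resistance figure is a ratio of inhibitory concentrations. A minimal sketch of that
arithmetic follows, using the heparin concentration ranges quoted above for the selected and
native Taq polymerases. The function name and the choice to report the lowest and highest
possible ratios are illustrative assumptions; the text does not say which ends of the ranges are
paired.

```python
def fold_resistance_range(variant_range, native_range):
    """Lowest and highest fold resistance implied by two concentration ranges.

    Each range is a (low, high) tuple of inhibitor concentrations in the
    same units (here units/microlitre of heparin).
    """
    v_low, v_high = variant_range
    n_low, n_high = native_range
    if min(v_low, v_high, n_low, n_high) <= 0:
        raise ValueError("concentrations must be positive")
    # Most conservative pairing: weakest variant against most tolerant native.
    # Most generous pairing: strongest variant against least tolerant native.
    return v_low / n_high, v_high / n_low


# Ranges as quoted in the text (units/microlitre).
low, high = fold_resistance_range((0.083, 0.33), (0.0005, 0.0026))
print(f"fold resistance between about {low:.0f}X and {high:.0f}X")
```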

As explained below, the methods of our invention may be used to select for a pair of
interacting polypeptides, and the conditions within the compartments may be altered to choose
polypeptides capable of acting under these conditions (for example, high salt, or elevated
temperature, etc.). The methods of our invention may also be used to select for the folding,
stability and/or solubility of a fused polypeptide acting under these conditions (for example,
high salt, or elevated temperature, chaotropic agents etc.).

The method of selection of our present invention may be used to select for various
replicative activities, for example, for polymerase activity. Here, the replicase is a polymerase,
and the catalytic reaction is the replication by the polymerase of its own gene. Thus, defective
polymerases or polymerases which are inactive under the conditions under which the reaction
is carried out (the selection conditions) are unable to amplify their own genes. Similarly,
polymerases which are less active will replicate their coding sequences within their
compartments more slowly. Accordingly, these genes will be under-represented, or even
disappear from the gene pool.



Active polymerases, on the other hand, are able to replicate their own genes, and the
resulting copy number of these genes will be increased. In a preferred embodiment of the
invention, the copy number of a gene within the pool will bear a direct relation to the
activity of the encoded polypeptide under the conditions under which the reaction is carried
out. In this preferred embodiment, the most active polymerase will be most represented in the
final pool (i.e., its copy number within the pool will be highest). As will be appreciated, this
enables easy cloning of active polymerases over inactive ones. The method of our invention
therefore is able to directly link the turnover rate of the enzyme to the resulting copy number
of the gene encoding it.

As an example, the method may be applied to the isolation of active polymerases
(DNA-, RNA-polymerases and reverse transcriptases) from thermophilic organisms. Briefly, a
thermostable polymerase is expressed intracellularly in bacterial cells and these are
compartmentalised (e.g. in a water-oil emulsion) in appropriate buffer together with
appropriate amounts of the four dNTPs and oligonucleotides priming at either end of the
polymerase gene or on plasmid sequences flanking the polymerase gene. The polymerase and
its gene are released from the cells by a temperature step that lyses the cells and destroys
enzymatic activities associated with the host cell. Polymerases from mesophilic organisms (or
less thermostable polymerases) may be expressed in an analogous way except cell lysis should
either proceed at ambient temperature (e.g. by expression of a lytic protein (e.g. derived from
lytic bacteriophages), or by detergent mediated lysis (e.g. Bugbuster™, commercially available))
or lysis may proceed at elevated temperature in the presence of a polymerase stabilizing agent
(e.g. high concentrations of proline (see example 27) in the case of Klenow or trehalose in the
case of RT). In such cases background polymerase activity of the host strain may interfere
with selections and it may be preferable to make use of mutant strains (e.g. polA-).

Alternatively, polymerase genes (either as plasmids or linear fragments) may be
compartmentalised as above and the polymerase expressed in situ within the compartments
using in vitro transcription translation (ivt), followed by a temperature step to destroy
enzymatic activities associated with the in vitro translation extract. Polymerases from
mesophilic organisms (or less thermostable polymerases) may be expressed in situ in an
analogous way except that, in order to avoid enzymatic activities associated with the in vitro
translation extract, it may be preferable to use a translation extract reconstituted from defined
purified components like the PURE system (Shimizu et al (2001) Nat. Biotech., 19:751).

PCR thermocycling then leads to the amplification of the polymerase genes by the
polypeptides they encode, i.e. only genes encoding active polymerases, or polymerases active
under the chosen conditions, will be amplified. Furthermore, the copy number of a polymerase
gene X after self-amplification will be directly proportional to the catalytic activity of the
polymerase X it encodes (see Figures 1A and 1B).
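
The consequence of this proportionality for the composition of the gene pool can be
illustrated with a toy model. The sketch below takes the stated proportionality between copy
number and activity at face value and renormalises the pool after each round of selection. The
variant names, starting frequencies and relative activities are hypothetical, and real
selections will deviate from this idealised picture.

```python
def select_round(freqs: dict, activity: dict) -> dict:
    """One round of self-amplification followed by renormalisation.

    Each gene's share of the output pool is taken to be proportional to its
    input share multiplied by the activity of the polymerase it encodes.
    """
    weighted = {name: freqs[name] * activity[name] for name in freqs}
    total = sum(weighted.values())
    if total <= 0:
        raise ValueError("no variant in the pool has any activity")
    return {name: value / total for name, value in weighted.items()}


def run_selection(freqs: dict, activity: dict, rounds: int) -> dict:
    """Apply select_round repeatedly and return the final frequencies."""
    for _ in range(rounds):
        freqs = select_round(freqs, activity)
    return freqs


# Hypothetical starting pool and relative activities under the selection conditions.
pool = {"wild_type": 0.98, "inactive": 0.01, "improved": 0.01}
activity = {"wild_type": 1.0, "inactive": 0.0, "improved": 5.0}

for rounds in (1, 2, 3):
    result = run_selection(pool, activity, rounds)
    print(rounds, {name: round(freq, 3) for name, freq in result.items()})
```

In this model an inactive variant disappears after the first round, while a rare but more
active variant gains share with every round.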

By varying the selection conditions within the compartment, polymerases or other
replicases with desired properties may be selected using the methods of our invention. Thus,
by exposing repertoires of polymerase genes (diversified through targeted or random
mutation) to self-amplification and by altering the conditions under which self-amplification
can occur, the system can be used for the isolation and engineering of polymerases with
altered, enhanced or novel properties. Such enhanced properties may include increased
thermostability, increased processivity, increased accuracy (better proofreading), increased
incorporation of unfavorable substrates (e.g., ribonucleotides, dye-modified, general bases
such as 5-nitroindole, or other unusual substrates such as pyrene nucleotides (Matray & Kool
(1999), Nature 399, 704-708) (Fig. 3)) or resistance to inhibitors (e.g. heparin in clinical
samples). Novel properties may be the incorporation of unnatural substrates (e.g.
ribonucleotides), bypass reading of damaged sites (e.g. abasic sites (Paz-Elizur T. et al (1997)
Biochemistry 36, 1766), thymidine-dimers (Wood R.D. (1999) Nature 399, 639), hydantoin-bases
(Duarte V. et al (1999) Nucleic Acids Res. 27, 496)) and possibly even novel chemistries
(e.g. novel backbones such as PNA (Nielsen PE. Curr Opin Biotechnol. 1999;10(1):71-5) or
sulfone (Benner SA et al. Pure Appl Chem. 1998 Feb;70(2):263-6) or altered sugar chemistries
(A. Eschenmoser, Science 284, 2118-24 (1999))). It may also be used to isolate or evolve
factors that enhance or modify polymerase function such as processivity factors (like
thioredoxin in the case of T7 DNA polymerase (Doublie S. et al (1998) Nature 391, 251)).

However, other enzymes besides replicases, such as telomerases, helicases etc may
also be selected according to our invention. Thus, telomerase is expressed in situ (in
compartments) by for example in vitro translation together with telomerase RNA (either
added or transcribed in situ as well; e.g. Bachand et al., (2000) RNA 6:778-784).

Compartments also contain Taq Pol and dNTPs and telomere specific primers. At low
temperature Taq is inactive but active telomerase will append telomeres to its own encoding
gene (a linear DNA fragment with appropriate ends). After the telomerase reaction,
thermocycling only amplifies active telomerase encoding genes. Diversity can be introduced
in the telomerase gene or RNA (or both) and could be targeted or random. As applied to selection
of helicases, the selection method is essentially the same as described for telomerases, but
helicase is used to unwind strands rather than heat denaturation.

The methods of our invention may also be used to select for DNA repair enzymes or
translesion polymerases such as E. coli Pol IV and Pol V. Here, damage is introduced into
primers (targeted chemistry) or randomly by mutagen treatment (e.g. UV, mutagenic
chemicals etc.). This allows for selection for enzymes able to repair primers required for
replication or their own gene sequence (information retrieval), resulting in improved
"repairases" for gene therapy etc.

The methods of our invention may also be used in their various embodiments for
selecting agents capable of directly or indirectly modulating replicase activity. In addition, the
invention may be used to select for a pair of polypeptides capable of interacting, or for
selection of catalytic nucleic acids such as catalytic RNA (ribozymes). These and other
embodiments will be explained in further detail below.

NUCLEIC ACID PROCESSING ENZYMES 

As referred to herein, a nucleic acid processing enzyme is any enzyme, which may be a
protein enzyme or a nucleic acid enzyme, which is capable of modifying, extending (such as
by at least one nucleotide), amplifying or otherwise influencing nucleic acids such as to render
the nucleic acid selectable by amplification in accordance with the present invention. Such
enzymes therefore possess an activity which results in, for example, amplification,
stabilisation, destabilisation, hybridisation or denaturation, replication, protection or
deprotection of nucleic acids, or any other activity on the basis of which a nucleic acid can be
selected by amplification. Examples include helicases, telomerases, ligases, recombinases,
integrases and replicases. Replicases are preferred.

REPLICASE/REPLICATION

As used here, the term "replication" refers to the template-dependent copying of a
nucleic acid sequence. Nucleic acids are discussed and exemplified below. In general, the
product of the replication is another nucleic acid, whether of the same species, or of a
different species. Thus, included are the replication of DNA to produce DNA, replication of
DNA to produce RNA, replication of RNA to produce DNA and replication of RNA to
produce RNA. "Replication" is therefore intended to encompass processes such as DNA
replication, polymerisation, ligation of oligonucleotides or polynucleotides (e.g. tri-nucleotide
(triplet) 5' triphosphates) to form longer sequences, transcription, reverse transcription, etc.

The term "replicase" is intended to mean an enzyme having catalytic activity, which is
capable of joining nucleotide building blocks together to form nucleic acid sequences. Such
nucleotide building blocks include, but are not limited to, nucleosides, nucleoside
triphosphates, deoxynucleosides, deoxynucleoside triphosphates, nucleotides (comprising a
nitrogen-containing base such as adenine, guanine, cytosine, uracil, thymine, etc., a 5-carbon
sugar and one or more phosphate groups), nucleotide triphosphates, deoxynucleotides such as
deoxyadenosine, deoxythymidine, deoxycytidine, deoxyuridine, deoxyguanosine,
deoxynucleotide triphosphates (dNTPs), and synthetic or artificial analogues of these.
Building blocks also include oligomers or polymers of any of the above, for example,
trinucleotides (triplets), oligonucleotides and polynucleotides.

Thus, a replicase may extend a pre-existing nucleic acid sequence (primer) by
incorporating nucleotides or deoxynucleotides. Such an activity is known in the art as
"polymerisation", and the enzymes which carry this out are known as "polymerases". An
example of such a polymerase replicase is DNA polymerase, which is capable of replicating
DNA. The primer may be the same chemically as, or different from, the extended sequence (for
example, mammalian DNA polymerase is known to extend a DNA sequence from an RNA
primer). The term replicase also includes those enzymes which join together nucleic acid
sequences, whether polymers or oligomers, to form longer nucleic acid sequences. Such an
activity is exhibited by the ligases, which ligate pieces of DNA or RNA.

The replicase may consist entirely of replicase sequence, or it may comprise a
replicase sequence linked to a heterologous polypeptide or other molecule such as an agent by
chemical means or in the form of a fusion protein or be assembled from two or more
constituent parts.

Preferably, the replicase according to the invention is a DNA polymerase, RNA
polymerase, reverse transcriptase, DNA ligase, or RNA ligase.

Preferably, the replicase is a thermostable replicase. A "thermostable" replicase as
used here is a replicase which demonstrates significant resistance to thermal denaturation at
elevated temperatures, typically above body temperature (37°C). Preferably, such a
temperature is in the range 42°C to 160°C, more preferably, between 60 to 100°C, most
preferably, above 90°C. Compared to a non-thermostable replicase, the thermostable replicase
displays a significantly increased half-life (time of incubation at elevated temperature that
results in 50% loss of activity). Preferably, the thermostable replicase retains 30% or more of
its activity after incubation at the elevated temperature, more preferably, 40%, 50%, 60%,
70% or 80% or more of its activity. Yet more preferably, the replicase retains 80% activity.
Most preferably, the activity retained is 90%, 95% or more, even 100%. Non-thermostable
replicases would exhibit little or no retention of activity after similar incubations at the
elevated temperature.

POLYMERASE 

An example of a replicase is DNA polymerase. DNA polymerase enzymes are
naturally occurring intracellular enzymes, and are used by a cell to replicate a nucleic acid
strand using a template molecule to manufacture a complementary nucleic acid strand.
Enzymes having DNA polymerase activity catalyze the formation of a bond between the 3'
hydroxyl group at the growing end of a nucleic acid primer and the 5' phosphate group of a
nucleotide triphosphate. These nucleotide triphosphates are usually selected from
deoxyadenosine triphosphate (A), deoxythymidine triphosphate (T), deoxycytidine
triphosphate (C) and deoxyguanosine triphosphate (G). However, DNA polymerases may
incorporate modified or altered versions of these nucleotides. The order in which the
nucleotides are added is dictated by base pairing to a DNA template strand; such base pairing
is accomplished through "canonical" hydrogen-bonding (hydrogen-bonding between A and T
nucleotides and G and C nucleotides of opposing DNA strands), although non-canonical base
pairing, such as G:U base pairing, is known in the art. See e.g., Adams et al., The
Biochemistry of the Nucleic Acids 14-32 (11th ed. 1992). The in-vitro use of enzymes having
DNA polymerase activity has in recent years become more common in a variety of
biochemical applications including cDNA synthesis and DNA sequencing reactions (see
Sambrook et al., (2nd ed. Cold Spring Harbor Laboratory Press, 1989) hereby incorporated by
reference herein), and amplification of nucleic acids by methods such as the polymerase chain
reaction (PCR) (Mullis et al., U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,800,159, hereby
incorporated by reference herein) and RNA transcription-mediated amplification methods
(e.g., Kacian et al., PCT Publication No. WO91/01384).

Methods such as PCR make use of cycles of primer extension through the use of a
DNA polymerase activity, followed by thermal denaturation of the resulting double-stranded
nucleic acid in order to provide a new template for another round of primer annealing and
extension. Because the high temperatures necessary for strand denaturation result in the
irreversible inactivation of many DNA polymerases, the discovery and use of DNA
polymerases able to remain active at temperatures above about 37°C to 42°C (thermostable
DNA polymerase enzymes) provides an advantage in cost and labor efficiency. Thermostable
DNA polymerases have been discovered in a number of thermophilic organisms including,
but not limited to, Thermus aquaticus, Thermus thermophilus, and species of the Bacillus,
Thermococcus, Sulfolobus and Pyrococcus genera. DNA polymerases can be purified directly
from these thermophilic organisms. However, substantial increases in the yield of DNA
polymerase can be obtained by first cloning the gene encoding the enzyme in a multicopy
expression vector by recombinant DNA technology methods, inserting the vector into a host
cell strain capable of expressing the enzyme, culturing the vector-containing host cells, then
extracting the DNA polymerase from a host cell strain which has expressed the enzyme.

The bacterial DNA polymerases that have been characterized to date have certain
patterns of similarities and differences which have led some to divide these enzymes into two
groups: those whose genes contain introns/inteins (Class B DNA polymerases), and those
whose DNA polymerase genes are roughly similar to that of E. coli DNA polymerase I and do
not contain introns (Class A DNA polymerases).

Several Class A and Class B thermostable DNA polymerases derived from
thermophilic organisms have been cloned and expressed. Among the Class A enzymes:
Lawyer, et al., J. Biol. Chem. 264:6427-6437 (1989) and Gelfund et al., U.S. Pat. No.
5,079,352, report the cloning and expression of a full length thermostable DNA polymerase
derived from Thermus aquaticus (Taq). Lawyer et al., in PCR Methods and Applications,
2:275-287 (1993), and Barnes, PCT Publication No. WO92/06188 (1992), disclose the
cloning and expression of truncated versions of the same DNA polymerase, while Sullivan,
EPO Publication No. 0482714A1 (1992), reports cloning a mutated version of the Taq DNA
polymerase. Asakura et al., J. Ferment. Bioeng. (Japan), 74:265-269 (1993) have reportedly
cloned and expressed a DNA polymerase from Thermus thermophilus. Gelfund et al., PCT
Publication No. WO92/06202 (1992), have disclosed a purified thermostable DNA
polymerase from Thermosipho africanus. A thermostable DNA polymerase from Thermus
flavus is reported by Akhmetzjanov and Vakhitov, Nucleic Acids Res., 20:5839 (1992).
Uemori et al., J. Biochem. 113:401-410 (1993) and EPO Publication No. 0517418A2 (1992)
have reported cloning and expressing a DNA polymerase from the thermophilic bacterium
Bacillus caldotenax. Ishino et al., Japanese Patent Application No. HEI 4[1992]-131400
(publication date Nov. 19, 1993) report cloning a DNA polymerase from Bacillus
stearothermophilus. Among the Class B enzymes: A recombinant thermostable DNA
polymerase from Thermococcus litoralis is reported by Comb et al., EPO Publication No. 0
455 430 A3 (1991), Comb et al., EPO Publication No. 0547920A2 (1993), and Perler et al.,
Proc. Natl. Acad. Sci. (USA), 89:5577-5581 (1992). A cloned thermostable DNA polymerase
from Sulfolobus solfataricus is disclosed in Pisani et al., Nucleic Acids Res. 20:2711-2716
(1992) and in PCT Publication WO93/25691 (1993). The thermostable enzyme of Pyrococcus
furiosus is disclosed in Uemori et al., Nucleic Acids Res., 21:259-265 (1993), while a
recombinant DNA polymerase is derived from Pyrococcus sp. as disclosed in Comb et al.,
EPO Publication No. 0547359A1 (1993).

Many thermostable DNA polymerases possess activities additional to a DNA
polymerase activity; these may include a 5'-3' exonuclease activity and/or a 3'-5' exonuclease
activity. The activities of 5'-3' and 3'-5' exonucleases are well known to those of ordinary
skill in the art. The 3'-5' exonuclease activity improves the accuracy of the newly-synthesized
strand by removing incorrect bases that may have been incorporated; DNA polymerases in
which such activity is low or absent, reportedly including Taq DNA polymerase (see Lawyer
et al., J. Biol. Chem. 264:6427-6437), have elevated error rates in the incorporation of
nucleotide residues into the primer extension strand. In applications such as nucleic acid
amplification procedures in which the replication of DNA is often geometric in relation to the
number of primer extension cycles, such errors can lead to serious artifactual problems such
as sequence heterogeneity of the nucleic acid amplification product (amplicon). Thus, a 3'-5'
exonuclease activity is a desired characteristic of a thermostable DNA polymerase used for
such purposes.

By contrast, the 5'-3' exonuclease activity often present in DNA polymerase enzymes
is often undesired in a particular application since it may digest nucleic acids, including
primers, that have an unprotected 5' end. Thus, an attenuated 5'-3' exonuclease activity, or
the absence of such activity, is also a desired characteristic of a thermostable DNA
polymerase for biochemical applications. Various DNA polymerase enzymes
have been described where a modification has been introduced in a DNA polymerase which
accomplishes this object. For example, the Klenow fragment of E. coli DNA polymerase I can
be produced as a proteolytic fragment of the holoenzyme in which the domain of the protein
controlling the 5'-3' exonuclease activity has been removed. The Klenow fragment still
retains the polymerase activity and the 3'-5' exonuclease activity. Barnes, supra, and Gelfund
et al., U.S. Pat. No. 5,079,352 have produced 5'-3' exonuclease-deficient recombinant Taq
DNA polymerases. Ishino et al., EPO Publication No. 0517418A2, have produced a 5'-3'
exonuclease-deficient DNA polymerase derived from Bacillus caldotenax. On the other hand,
polymerases lacking the 5'-3' exonuclease domain often have reduced processivity.

LIGASE

DNA strand breaks and gaps are generated transiently during replication, repair and
recombination. In mammalian cell nuclei, rejoining of such strand breaks depends on several
different DNA polymerases and DNA ligase enzymes. The mechanism for joining of DNA
strand interruptions by DNA ligase enzymes has been widely described. The reaction is
initiated by the formation of a covalent enzyme-adenylate complex. Mammalian and viral
DNA ligase enzymes employ ATP as cofactor, whereas bacterial DNA ligase enzymes use
NAD to generate the adenylyl group. In the case of ATP-utilising ligases, the ATP is cleaved
to AMP and pyrophosphate with the adenylyl residue linked by a phosphoramidate bond to
the ε-amino group of a specific lysine residue at the active site of the protein (Gumport, R. I.,
et al., PNAS 68:2559-63 (1971)). The activated AMP residue of the DNA ligase-adenylate
intermediate is transferred to the 5' phosphate terminus of a single strand break in double
stranded DNA to generate a covalent DNA-AMP complex with a 5'-5' phosphoanhydride
bond. This reaction intermediate has also been isolated for microbial and mammalian DNA
ligase enzymes, but is shorter lived than the adenylylated enzyme. In the final step of DNA
ligation, unadenylylated DNA ligase enzymes required for the generation of a phosphodiester
bond catalyze displacement of the AMP residue through attack by the adjacent 3'-hydroxyl
group on the adenylylated site.

The occurrence of three different DNA ligase enzymes, DNA Ligase I, II and III, was
established previously by biochemical and immunological characterization of purified
enzymes (Tomkinson, A. E. et al., J. Biol. Chem., 266:21728-21735 (1991) and Roberts, R.,
et al., J. Biol. Chem., 269:3789-3792 (1994)).

AMPLIFICATION

The methods of our invention involve the templated amplification of desired nucleic
acids. "Amplification" refers to the increase in the number of copies of a particular nucleic
acid fragment (or a portion of this) resulting either from an enzymatic chain reaction (such as
a polymerase chain reaction, a ligase chain reaction, or a self-sustained sequence replication)
or from the replication of all or part of the vector into which it has been cloned. Preferably,
the amplification according to our invention is an exponential amplification, as exhibited by
for example the polymerase chain reaction.

Many target and signal amplification methods have been described in the literature, for
example, general reviews of these methods in Landegren, U., et al., Science 242:229-237
(1988) and Lewis, R., Genetic Engineering News 10:1, 54-55 (1990). These amplification
methods may be used in the methods of our invention, and include polymerase chain reaction
(PCR), PCR in situ, ligase amplification reaction (LAR), ligase hybridization, Qβ
bacteriophage replicase, transcription-based amplification system (TAS), genomic
amplification with transcript sequencing (GAWTS), nucleic acid sequence-based
amplification (NASBA) and in situ hybridization.

Polymerase Chain Reaction (PCR)

PCR is a nucleic acid amplification method described inter alia in U.S. Pat. Nos.
4,683,195 and 4,683,202. PCR consists of repeated cycles of DNA polymerase generated
primer extension reactions. The target DNA is heat denatured and two oligonucleotides,
which bracket the target sequence on opposite strands of the DNA to be amplified, are
hybridized. These oligonucleotides become primers for use with DNA polymerase. The DNA
is copied by primer extension to make a second copy of both strands. By repeating the cycle of
heat denaturation, primer hybridization and extension, the target DNA can be amplified a
million fold or more in about two to four hours. PCR is a molecular biology tool, which must
be used in conjunction with a detection technique to determine the results of amplification. An
advantage of PCR is that it increases sensitivity by amplifying the amount of target DNA by 1
million to 1 billion fold in approximately 4 hours.
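
The million-fold and billion-fold figures follow from repeated doubling. The sketch below
assumes an idealised reaction in which every cycle multiplies the target by (1 + efficiency),
with efficiency 1.0 meaning perfect doubling. The function names and the per-cycle efficiency
parameter are illustrative assumptions; real reactions plateau and do not sustain a constant
efficiency.

```python
import math


def ideal_fold(cycles: int, efficiency: float = 1.0) -> float:
    """Fold amplification after a number of cycles at constant per-cycle efficiency."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return (1.0 + efficiency) ** cycles


def cycles_needed(fold: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles that reaches at least the requested fold."""
    if fold < 1.0:
        raise ValueError("fold must be at least 1")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


print(cycles_needed(1e6))        # cycles for a million-fold increase, perfect doubling
print(cycles_needed(1e9))        # cycles for a billion-fold increase, perfect doubling
print(cycles_needed(1e6, 0.8))   # same target at a hypothetical 80% efficiency
```

Under perfect doubling, about 20 cycles give a million-fold and about 30 cycles a
billion-fold increase; lower efficiencies require more cycles for the same yield.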

The polymerase chain reaction may be used in the selection methods of our invention
as follows. For example, PCR may be used to select for variants of Taq polymerase having
polymerase activity. As described in further detail above, a library of nucleic acids each
encoding a replicase or a variant of the replicase, for example, Taq polymerase, is generated
and subdivided into compartments. Each compartment comprises substantially one member of
the library together with the replicase or variant encoded by that member.



The polymerase or variant may be expressed in vivo within a transformed bacterium or
any other suitable expression host, for example yeast or insect or mammalian cells, and the
expression host encapsulated within a compartment. Heat or other suitable means is applied to
disrupt the host and to release the polymerase variant and its encoding nucleic acid within the
compartment. In the case of a bacterial host, timed expression of a lytic protein, for example
protein E from ΦX174, or use of an inducible λ lysogen, may be employed for disrupting the
bacterium.

It will be clear that the polymerase or other enzyme need not be a heterologous protein
expressed in that host (e.g., a plasmid), but may be expressed from a gene forming part of the
host genome. Thus, the polymerase may be for example an endogenous or native bacterial
polymerase. We have shown that in the case of nucleotide diphosphate kinase (ndk),
endogenous (uninduced) expression of ndk is sufficient to generate dNTPs for its own
replication. Thus, the methods of selection according to our invention may be employed for
the direct functional cloning of polymerases and other enzymes from diverse (and uncultured)
microbial populations.

Alternatively, the nucleic acid library may be compartmentalised together with
components of an in vitro transcription/translation system (as described in further detail in this
document), and the polymerase variant expressed in vitro within the compartment.

Each compartment also comprises components for a PCR reaction, for example,
nucleotide triphosphates (dNTPs), buffer, magnesium, and oligonucleotide primers. The
oligonucleotide primers may have sequences corresponding to sequences flanking the
polymerase gene (i.e., within the genomic or vector DNA) or to sequences within the
polymerase gene. PCR thermal cycling is then initiated to allow any polymerase variant
having polymerase activity to amplify the nucleic acid sequence.

Active polymerases will amplify their corresponding nucleic acid sequences, while
nucleic acid sequences encoding weakly active or inactive polymerases will be weakly
replicated or not be replicated at all. In general, the final copy number of each member of the
nucleic acid library will be expected to be proportional to the level of activity of the
polymerase variant encoded by it. Nucleic acids encoding active polymerases will be
over-represented, and nucleic acids encoding inactive or weakly active polymerases will be
under-represented. The resulting amplified sequences may then be cloned and sequenced, etc.,
and replication ability of each member assayed.
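
The over-representation of active variants can be illustrated with a small numerical sketch.
It is not part of the disclosed method: it assumes, purely for illustration, that each variant
amplifies its own gene with a fixed per-cycle efficiency between 0 and 1 that stands in for
polymerase activity, so that copy number grows as (1 + efficiency) per cycle. The function
name, the three variants and their efficiencies are all hypothetical.

```python
def csr_enrichment(pool, cycles):
    """Return the post-amplification frequency of each variant.

    pool maps a variant name to (starting_copies, efficiency), where
    efficiency is a hypothetical per-cycle amplification efficiency in
    [0, 1] used here as a stand-in for polymerase activity.
    """
    copies = {
        name: start * (1.0 + efficiency) ** cycles
        for name, (start, efficiency) in pool.items()
    }
    total = sum(copies.values())
    return {name: count / total for name, count in copies.items()}


# Illustrative library: one copy each of three variants (values are assumptions).
library = {"active": (1, 1.0), "weak": (1, 0.5), "inactive": (1, 0.0)}
frequencies = csr_enrichment(library, cycles=20)
for name, freq in sorted(frequencies.items(), key=lambda kv: -kv[1]):
    print(f"{name:>8}: {freq:.6f}")
```

Under this assumed model the dependence on activity is steeper than simple proportionality,
because the advantage compounds over cycles; a model in which the final copy number is linear
in activity would give a gentler enrichment per round.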

As described in further detail elsewhere, the conditions within each compartment may
be altered to select for polymerases active under these conditions. For example, heparin may
be added to the reaction mix to choose polymerases which are resistant to heparin. The
temperature at which PCR takes place may be elevated to select for heat resistant variants of
polymerase. Furthermore, polymerases may be selected which are capable of extending DNA
sequences such as primers with altered 3' ends or altered parts of the primer sequence. The
altered 3' ends or other alterations can include unnatural bases (altered sugar or base
moieties), modified bases (e.g. blocked 3' ends) or even primers with altered backbone
chemistries (e.g. PNA primers).

Reverse Transcriptase-PCR

RT-PCR is used to amplify RNA targets. In this process, the reverse transcriptase
enzyme is used to convert RNA to complementary DNA (cDNA), which can then be
amplified using PCR. This method has proven useful for the detection of RNA viruses.

The methods of our invention may employ RT-PCR. Thus, the pool of nucleic acids
encoding the replicase or its variants may be provided in the form of an RNA library. This
library could be generated in vivo in bacteria, mammalian cells, yeast etc., which are
compartmentalised, or by in vitro transcription of compartmentalised DNA. The RNA could
encode a co-compartmentalised replicase (e.g. reverse transcriptase or polymerase) that has
been expressed in vivo (and released in emulsion along with the RNA by means disclosed
below) or in vitro. Other components necessary for amplification (polymerase and/or reverse
transcriptase, dNTPs, primers) are also compartmentalised. Under a given selection pressure,
the cDNA product of the reverse transcription reaction serves as a template for PCR
amplification. As with other replication reactions (in particular ndk in the Examples) the RNA
may encode a range of enzymes feeding the reaction.

Self-Sustained Sequence Replication (3SR)

Self-sustained sequence replication (3SR) is a variation of TAS, which involves the
isothermal amplification of a nucleic acid template via sequential rounds of reverse
transcriptase (RT), polymerase and nuclease activities that are mediated by an enzyme
cocktail and appropriate oligonucleotide primers (Guatelli et al. (1990) Proc. Natl. Acad. Sci.
USA 87:1874). Enzymatic degradation of the RNA of the RNA/DNA heteroduplex is used
instead of heat denaturation. RNAse H and all other enzymes are added to the reaction and all
steps occur at the same temperature and without further reagent additions. Following this
process, amplifications of 10^ to 10^ have been achieved in one hour at 42°C.

The methods of our invention may therefore be extended to select polymerases or replicases
from mesophilic organisms using 3SR isothermal amplification (Guatelli et al.
(1990) Proc. Natl. Acad. Sci. USA 87:7797; Compton (1991) Nature 350:91-92) instead of
PCR thermocycling. As described above, 3SR involves the concerted action of two enzymes:
an RNA polymerase and a reverse transcriptase cooperate in a coupled reaction of
transcription and reverse transcription, leading to the simultaneous amplification of both RNA
and DNA. Clearly, in this system self-amplification may be applied to either of the two
enzymes involved or to both simultaneously. It may also include the evolution of the RNAse
H activity either as part of the reverse transcriptase enzyme (e.g. HIV-1 RT) or on its own.

The various enzymatic activities that define 3SR and related methods are all targets for
selection using the methods of our invention. Variants of either T7 RNA polymerase, reverse
transcriptase (RT), or RNAse H can be provided within the aqueous compartments of the
emulsions, and selected for under otherwise limiting conditions. These variants can be
introduced via E. coli "gene pellets" (i.e., bacteria expressing the polypeptide), or other means
as described elsewhere in this document. Initial release in emulsion may be mediated by
enzymatic (for example, lambda lysogen) or thermal lysis, or other methods as disclosed here.
The latter may necessitate the use of agents that stabilize enzymatic activity at transiently
elevated temperatures. For example, it may be necessary to include amounts of proline,
glycerol, trehalose or other stabilising agents as known in the art to effect stabilisation of
thermosensitive enzymes such as reverse transcriptase. Furthermore, stepwise removal of the
agent may be undertaken to select for increased stability of the thermosensitive enzyme.

Alternatively, and as disclosed elsewhere, variants may be produced via coupled
transcription translation, with the expressed products feeding into the 3SR cycle.

It will also be appreciated that it is possible to replace reverse transcriptase with the
thermostable Tth DNA polymerase. Tth DNA polymerase is known to have reverse
transcriptase activity and the RNA template is effectively reverse-transcribed into template
DNA using this enzyme. It is therefore possible to select for useful variants of this enzyme, by
for example, introducing bacterially expressed T7 RNA polymerase variants into emulsion
and preincubation at an otherwise non-permissive temperature.

Example 18 below is an example showing one way in which the methods of our
invention may be applied to selection of replicases using self-sustained sequence replication
(3SR).

Ligation Amplification (LAR/LAS)

Ligation amplification reaction or ligation amplification system uses DNA ligase and
four oligonucleotides, two per target strand. This technique is described by Wu, D. Y. and
Wallace, R. B. (1989) Genomics 4:560. The oligonucleotides hybridize to adjacent sequences
on the target DNA and are joined by the ligase. The reaction is heat denatured and the cycle
repeated.

By analogy to the application to polymerases, our method may be applied to ligases, in
particular from thermophilic organisms. Oligonucleotides complementary to one strand of the
ligase gene sequence are synthesized (either as perfect match or comprising targeted or
random diversity). The two end oligos overlap into the vector or untranslated regions of the
ligase gene. The ligase gene is cloned for expression in an appropriate host and
compartmentalized together with the oligonucleotides and an appropriate energy source
(usually ATP (or NADPH)). If necessary, the ligase expressed as above in bacteria is released
from the cells by thermal lysis. Compartments contain appropriate buffer together with
appropriate amounts of an appropriate energy source (ATP or NADH) and oligonucleotides
encoding the whole of the ligase gene as well as flanking sequences required for cloning.
Ligation of oligonucleotides leads to assembly of a full-length ligase gene (templated by the
ligase gene on the expression plasmid) by an active ligase. In compartments containing an
inactive ligase, no assembly will take place. As with polymerases, the copy number of a ligase
gene X after self-ligation will preferably be proportional to the catalytic activity under the
selection conditions of the ligase X it encodes.

After lysis of the cell, thermocycling leads to annealing of the oligonucleotides to the
ligase gene. However, ligation of the oligos and thus assembly of the full-length ligase gene
depends on the presence of an active ligase in the same compartment. Thus only genes
encoding active ligases will assemble their own encoding genes from the oligonucleotides
present. Assembled genes can then be amplified, diversified and recloned for another
round of selection if necessary. The methods of our invention are therefore suitable for the
selection of ligases which are faster or more efficient at ligation.

As noted elsewhere, the ligase can be produced either in situ by expression from a
suitable bacterial or other host, or by in vitro translation. The ligase may be an oligonucleotide
(e.g. ribozyme or deoxyribozyme) ligase assembling its own sequence from available
fragments, or the ligase may be a conventional (polypeptide) ligase. The length of the
oligonucleotides will depend on the particular reaction, but if necessary, they can be very
short (e.g. triplets). As noted elsewhere, the method of our invention may be used to select for
an agent capable of modulating ligase activity, either directly or indirectly. For example, the
gene to be evolved may be another enzyme or enzymes that generates a substrate for the ligase
(e.g. NADH) or consumes an inhibitor. In this case the oligonucleotides encode parts of the
other enzyme or enzymes etc.

The ligation reaction between oligonucleotides may incorporate alternative chemistries
e.g. amide linkages. As long as the chemical linkages do not interfere with templated copying
of the opposite strand by any replicase (e.g. reverse transcriptase), a wide variety of linkage
chemistries and ligases that catalyse them may be evolved.

Qβ Replicase

In this technique, RNA replicase for the bacteriophage Qβ, which replicates single-
stranded RNA, is used to amplify the target DNA, as described by Lizardi et al. (1988)
Bio/Technology 6:1197. First, the target DNA is hybridized to a primer including a T7
promoter and a Qβ 5' sequence region. Using this primer, reverse transcriptase generates a
cDNA connecting the primer to its 5' end in the process. These two steps are similar to the TAS
protocol. The resulting heteroduplex is heat denatured. Next, a second primer containing a Qβ
3' sequence region is used to initiate a second round of cDNA synthesis. This results in a
double stranded DNA containing both 5' and 3' ends of the Qβ bacteriophage as well as an
active T7 RNA polymerase binding site. T7 RNA polymerase then transcribes the double-
stranded DNA into new RNA, which mimics the Qβ. After extensive washing to remove any
unhybridized probe, the new RNA is eluted from the target and replicated by Qβ replicase.
The latter reaction creates 10^ fold amplification in approximately 20 minutes. Significant
background may be formed due to minute amounts of probe RNA that is non-specifically
retained during the reaction.

A reaction employing Qβ replicase as described above may be used to build a
continuous selection reaction in an alternative embodiment according to our invention.

For example, the gene for Qβ replicase (with appropriate 5' and 3' regions) is added to
an in vitro translation reaction and compartmentalised. In compartments, the replicase is
expressed and immediately starts to replicate its own gene. Only genes encoding an active
replicase replicate themselves. Replication proceeds until NTPs are exhausted. However, as
NTPs can be made to diffuse through the emulsion (see the description of ndk in the
Examples), the replication reaction may be "fed" from the outside and proceed much longer,
essentially until there is no room left within the compartments for further replication. It is
possible to propagate the reaction further by serial dilution of the emulsion mix into a fresh
oil-phase and re-emulsification after addition of a fresh water-phase containing NTPs. Qβ
replicase is known to be very error-prone, so replication alone will introduce lots of random
diversity (which may be desirable). The methods described here allow the evolution of more
specific (e.g. primer dependent) forms of Qβ replicase. As with other replication reactions (in
particular ndk in the Examples) a range of enzymes feeding the reaction may be evolved.
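
The effect of propagating the reaction through repeated dilution and re-emulsification can be
sketched numerically. The following is a minimal illustration under stated assumptions, not a
description of the experimental procedure: genes encoding an active replicase are assumed to
amplify by a fixed factor (`gain`, a hypothetical parameter) in every round, genes encoding an
inactive replicase are assumed not to amplify at all, and dilution is assumed to preserve the
relative proportions of the two classes.

```python
def rounds_to_reach(start_fraction, gain, target_fraction, max_rounds=100):
    """Count selection rounds until active genes reach target_fraction.

    gain is a hypothetical per-round amplification factor for genes encoding
    an active replicase; genes encoding an inactive replicase are assumed
    not to amplify. Returns None if the target is not reached in max_rounds.
    """
    if not (0.0 < start_fraction < 1.0 and 0.0 < target_fraction < 1.0):
        raise ValueError("fractions must be strictly between 0 and 1")
    if gain <= 1.0:
        raise ValueError("gain must exceed 1 for enrichment to occur")
    if start_fraction >= target_fraction:
        return 0
    fraction = start_fraction
    for round_number in range(1, max_rounds + 1):
        active = gain * fraction
        # Renormalise after amplification; dilution keeps the proportions.
        fraction = active / (active + (1.0 - fraction))
        if fraction >= target_fraction:
            return round_number
    return None


# One active gene per million, assumed 1000-fold gain per round.
print(rounds_to_reach(1e-6, 1000.0, 0.9))
```

In this simplified picture the odds in favour of the active genes are multiplied by `gain` in
each round, so the number of rounds needed grows only logarithmically with the rarity of the
active variant.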

Other Amplification Techniques 

Alternative amplification technology may be exploited in the present invention. For
example, rolling circle amplification (Lizardi et al., (1998) Nat Genet 19:225) is an
amplification technology available commercially (RCAT™) which is driven by DNA
polymerase and can replicate circular oligonucleotide probes with either linear or geometric
kinetics under isothermal conditions.

In the presence of two suitably designed primers, a geometric amplification occurs via
DNA strand displacement and hyperbranching to generate 10^^ or more copies of each circle
in 1 hour.

If a single primer is used, RCAT generates in a few minutes a linear chain of
thousands of tandemly linked DNA copies of a target covalently linked to that target.

A further technique, strand displacement amplification (SDA; Walker et al., (1992)
PNAS (USA) 89:392) begins with a specifically defined sequence unique to a specific target.
But unlike other techniques which rely on thermal cycling, SDA is an isothermal process that
utilizes a series of primers, DNA polymerase and a restriction enzyme to exponentially
amplify the unique nucleic acid sequence.

SDA comprises both a target generation phase and an exponential amplification
phase.

In target generation, double-stranded DNA is heat denatured creating two single-
stranded copies. A series of specially manufactured primers combine with DNA polymerase
(amplification primers for copying the base sequence and bumper primers for displacing the
newly created strands) to form altered targets capable of exponential amplification.




The exponential amplification process begins with altered targets (single-stranded
partial DNA strands with restriction enzyme recognition sites) from the target generation
phase.

An amplification primer is bound to each strand at its complementary DNA sequence.
DNA polymerase then uses the primer to identify a location to extend the primer from its 3'
end, using the altered target as a template for adding individual nucleotides. The extended
primer thus forms a double-stranded DNA segment containing a complete restriction enzyme
recognition site at each end.

A restriction enzyme is then bound to the double stranded DNA segment at its
recognition site. The restriction enzyme dissociates from the recognition site after having
cleaved only one strand of the double-sided segment, forming a nick. DNA polymerase
recognizes the nick and extends the strand from the site, displacing the previously created
strand. The recognition site is thus repeatedly nicked and restored by the restriction enzyme
and DNA polymerase with continuous displacement of DNA strands containing the target
segment.

Each displaced strand is then available to anneal with amplification primers as above.
The process continues with repeated nicking, extension and displacement of new DNA
strands, resulting in exponential amplification of the original DNA target.

Selection of Catalytic RNA 

Known methods of in vitro evolution have been used to generate catalytically active
RNA molecules (ribozymes) with a diverse range of activities. However, these have involved
selection by self-modification, which inherently isolates variants that rely on proximity
catalysis and which display reduced activities in trans.

Compartmentalisation affords a means to select for truly trans-acting ribozymes
capable of multiple turnover, without the need to tether substrate to the ribozyme by covalent
linkage or hydrogen-bonding (i.e., base-pairing) interactions.

In its simplest case, a gene encoding a ribozyme can be introduced into emulsion and
readily transcribed as demonstrated by the transcription and the 3SR amplification of the RNA
encoding Taq polymerase in situ as follows: The Taq polymerase gene is first transcribed in
emulsion. 100 µl of a reaction mix comprising 80 mM HEPES-KOH (pH 7.5), 24 mM MgCl2,
2 mM spermidine, 40 mM DTT, rNTPs (30 mM), 50 ng T7-Taq template (see Example 18,
Selection Using Self-Sustained Sequence Replication (3SR)), 60 units T7 RNA polymerase
(USB), 40 units RNAsin (Promega) is emulsified using the standard protocol. Emulsions are
incubated at 37°C for up to 6 hours and analysis of reaction products by gel electrophoresis
showed levels of RNA production to be comparable to those of the non-emulsified control.

By creating a 5' overhang (e.g. by ligation of either DNA or RNA adaptors) in the
emulsified gene, RNA variants are selected for with the ability of carrying out the template
directed addition of successive dNTPs in trans (i.e. polymerase activity, see Figure 6). Genes
that have been "filled-in" may be rescued by PCR using primers complementary to the single-
stranded region of the gene (i.e., the region which is single stranded prior to ribozyme fill-in)
or by capture of biotin (or otherwise) modified nucleotides that are incorporated followed by
PCR. In compartments without catalytic RNA activity, this region remains single stranded,
and PCR will fail to amplify the template (alternatively no nucleotides are incorporated and
the template is not captured but washed away).

A coupling approach can also be used to further extend the range of enzymatic
activities that could be selected for. For example, co-emulsification of a DNA polymerase
with the gene described above (5' overhang) can be used to select for ribozymes that convert
an otherwise unsuitable NTP substrate into one that can be utilised by the polymerase. As
before, the "filled-in" gene can then be rescued by PCR. The above approach can also be used
to select for protein polymerase enzyme produced in situ from a similar template (i.e. with 3'
overhang). A diagram showing the selection of RNA having catalytic activity is shown as
Figure 6.

Selection of Agents Capable of Modifying Replicase Activity

In another embodiment, our invention is used to select for an agent capable of
modifying the activity of a replicase. In this embodiment, a pool of nucleic acids is generated
comprising members encoding one or more candidate agents. Members of the nucleic acid
library are compartmentalised together with a replicase (which, as explained above, is able
only to act on the nucleic acid encoding the agent).

The candidate agents may be functionally or chemically distinct from each other, or
they may be variants of an agent known or suspected to be capable of modulating replicase
activity. Members of the pool are then segregated into compartments together with the
polypeptides or polynucleotides encoded by them, so that preferably each compartment
comprises a single member of the pool together with its cognate encoded polypeptide. Each
compartment also comprises one or more molecules of the replicase. Thus, the encoded
polypeptide agent is able to modulate the activity of the replicase, to prevent or enhance
replication of the compartmentalised nucleic acid (i.e., the nucleic acid encoding the agent). In
this way, the polypeptide agent is able to act via the replicase to increase or decrease the
number of molecules of its encoding nucleic acid. In a highly preferred embodiment of the
invention, the agent is capable of enhancing replicase activity, to enable detection or selection
of the agent by detecting the encoding nucleic acid.

The modulating agent may act directly or indirectly on the replicase. For example, the
modulating agent may be an enzyme comprising an activity which acts on the replicase
molecule, for example, by a post-translational modification of replicase, to activate or
inactivate the replicase. The agent may act by removing a ligand from, or adding a ligand to,
the replicase molecule. It is known that many replicases such as polymerases and ligases are
regulated by phosphorylation, so that in preferred embodiments the agent according to the
invention is a kinase or a phosphorylase. The modulating agent may also directly interact with
the replicase and modify its properties (e.g. Thioredoxin & T7-DNA polymerase, members of
the replisome e.g. clamp, helicase etc. with DNA polymerase III).

Alternatively, the modulating agent may exert its effects on the replicase in an indirect
manner. For example, modulation of replicase activity may take place via a third body, which
third body is modified by the modulating agent, for example as described above.

Furthermore, the modulating agent may be an enzyme which forms part of a pathway
which produces as an end product a substrate for the replicase. In this embodiment, the
modulating agent is involved in the synthesis of an intermediate (or the end product) of the
pathway. Accordingly, the rate of replication (and hence the amount of nucleic acid encoding
the agent) is dependent on the activity of the modulating agent.

For example, the modulating agent may be a kinase that is involved in the biosynthesis
of bases, deoxyribonucleosides, deoxyribonucleotides such as dAMP, dCMP, dGMP and
dTMP, deoxyribonucleoside diphosphates (such as dADP, dCDP, dGDP and dTDP),
deoxyribonucleoside triphosphates such as dATP, dCTP, dGTP or dTTP, or nucleosides,
nucleotides such as AMP, CMP, GMP and UMP, nucleoside diphosphates (such as ADP,
CDP, GDP and UDP), nucleoside triphosphates such as ATP, CTP, GTP or UTP, etc. The
modulating agent may be involved in the synthesis of other intermediates in the biosynthesis
of nucleotides (as described and well known from biochemical textbooks such as Stryer or
Lehninger), such as IMP, 5-phospho-α-D-ribose-1-pyrophosphoric acid, 5-phospho-β-D-
ribosylamine, 5-phosphoribosyl-glycinamide, 5-phosphoribosyl-N-formylglycinamide, etc.
Thus, the agent may comprise an enzyme such as ribosephosphate pyrophosphokinase,
phosphoribosylglycinamide synthetase, etc. Other examples of such agents will be apparent to
those skilled in the art. The methods of our invention allow the selection of such agents with
improved catalytic activity.

In yet another embodiment, the modulator functions to "unblock" a constituent of the
replication cocktail (primers, dNTP, replicase etc). An example of a blocked constituent
would be a primer or dNTP with a chemical moiety attached that inhibits the replicase used in
the CSR cycle. Alternatively, the pair of primers used could be covalently tethered by a
linking agent, with cleavage of the agent by the modulator allowing both primers to amplify
its gene in the presence of supplemented replicase. An example of a linking agent would be a
peptide nucleic acid (PNA). Additionally, by designing a large oligonucleotide that encodes a
pair of primer sequences interspersed by target nucleotide sequence, novel site-specific
restriction enzymes could be evolved. As before, the rate of replication (and hence the amount
of nucleic acid encoding the agent) is dependent on the activity of the modulating agent.
Alternatively the modulator can modify the 5' end of a primer such that amplification products
incorporating the primer can be captured by a suitable agent (e.g. antibody) and thus enriched
and reamplified.

In a further embodiment, the scope of CSR may be further broadened to select for
agents that are not necessarily thermostable. Delivery vehicles (e.g. E. coli) containing
expression constructs that encode a secretable form of a modulator/replicase of interest are
compartmentalised. Inclusion of an inducing agent in the aqueous phase and incubation at
permissive temperature (e.g. 37°C) allows for expression and secretion of the
modulator/replicase into the compartment. Sufficient time is then allowed for the modulator to
act in any of the aforementioned ways to facilitate subsequent amplification of the gene
encoding it (e.g. consume an inhibitor of replication). The ensuing temperature change during
the amplification process serves to rid the compartment of host cell enzymatic activities (that
have up to this point been segregated from the aqueous phase) and release the encoding gene
for amplification.

Thus, according to an embodiment of our invention, we provide a method of selecting
a polypeptide involved in a pathway which has as an end product a substrate which is
involved in a replication reaction ("a pathway polypeptide"), the method comprising the steps
of: (a) providing a replicase; (b) providing a pool of nucleic acids comprising members each
encoding a pathway polypeptide or a variant of the pathway polypeptide; (c) subdividing the
pool of nucleic acids into compartments, such that each compartment comprises a nucleic
acid member of the pool, the pathway polypeptide or variant encoded by the nucleic acid
member, the replicase, and other components of the pathway; and (d) detecting amplification
of the nucleic acid member by the replicase.

The Examples (in particular Example 19 and following Examples) show the use of our
invention in the selection of nucleoside diphosphate kinase (NDP Kinase), which catalyses the
transfer of a phosphate group from ATP to a deoxynucleoside diphosphate to produce a
deoxynucleoside triphosphate.

In yet another embodiment, the modulating agent is such that it consumes an inhibitor
of replicase activity. For example, it is known that heparin is an inhibitor of replicase
(polymerase) activity. Our method allows the selection of a heparinase with enhanced activity,
by compartmentalisation of a library of nucleic acids encoding heparinase or variants of this
enzyme in the presence of heparin and polymerase. Heparinase variants with enhanced
activity are able to break down heparin to a greater extent or more rapidly, removing the
inhibition of replicase activity within the compartment and allowing the replication of the
nucleic acid within the compartment (i.e., the nucleic acid encoding that heparinase variant).
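
The coupling between inhibitor consumption and gene amplification can be illustrated with a
toy calculation. Everything in it is an assumption made for the sake of the sketch and is not
taken from the Examples: heparin is assumed to be degraded with first-order kinetics during a
pre-incubation, residual heparin is assumed to reduce the per-cycle amplification efficiency
by a simple factor 1/(1 + [heparin]/Ki), and all parameter names and values are hypothetical.

```python
import math


def copies_after_selection(k_deg, heparin0=10.0, ki=1.0,
                           preincubation=10.0, cycles=20):
    """Estimate gene copies produced in a compartment (toy model).

    k_deg         assumed first-order heparin degradation rate of the variant
    heparin0      assumed starting heparin level (arbitrary units)
    ki            assumed inhibition constant of the polymerase (same units)
    preincubation assumed time allowed for heparin breakdown before cycling
    cycles        number of amplification cycles
    """
    residual = heparin0 * math.exp(-k_deg * preincubation)
    efficiency = 1.0 / (1.0 + residual / ki)  # 1.0 means perfect doubling
    return (1.0 + efficiency) ** cycles


inactive = copies_after_selection(k_deg=0.0)
improved = copies_after_selection(k_deg=1.0)
print(f"inactive variant: {inactive:.1f} copies per starting gene")
print(f"improved variant: {improved:.3g} copies per starting gene")
```

In this sketch a variant that removes the inhibitor before cycling begins recovers nearly full
doubling per cycle, whereas a variant with no activity leaves the polymerase inhibited
throughout, which is the qualitative behaviour the selection relies on.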

Selection of Interacting Polypeptides

The most important systems for the selection of protein-protein interactions are in vivo
methods, with the most important and best developed being the yeast two-hybrid system
(Fields & Song, Nature (1989) 340, 245-246). In this system and related approaches two
hybrid proteins are generated: a bait-hybrid comprising protein X fused to a DNA-binding
domain and a prey-hybrid comprising protein Y fused to a transcription activation domain,
with cognate interaction of X and Y reconstituting the transcriptional activator. Two other in
vivo systems have been put forward in which the polypeptide chain of an enzyme is expressed
in two parts fused to two proteins X and Y and in which cognate X-Y interaction reconstitutes
function of the enzyme (Karimova (1998) Proc Natl Acad Sci U S A, 95, 5752-6; Pelletier
(1999) Nat Biotechnol 17, 683-690), conferring a selectable phenotype on the cell.

It has recently been shown that Taq polymerase can be split in a similar way
(Vainshtein et al. (1996) Protein Science 5, 1785). According to our invention, therefore, we
provide a method of selecting a pair of polypeptides capable of stable interaction by splitting
Taq polymerase or any enzyme or factor auxiliary to the polymerase reaction.

The method comprises several steps. The first step consists of providing a first nucleic
acid and a second nucleic acid. The first nucleic acid encodes a first fusion protein comprising
a first subdomain of a replicase (or other, see above) enzyme fused to a first polypeptide, while
the second nucleic acid encodes a second fusion protein comprising a second subdomain of a
replicase (or other, see above) enzyme fused to a second polypeptide. The two fusion proteins
are such that stable interaction of the first and second replicase (or other, see above)
subdomains generates replicase activity (either directly or indirectly). At least one of the first
and second nucleic acids (preferably both) is provided in the form of a pool of nucleic acids
encoding variants of the respective first and/or second polypeptide(s).

The pool or pools of nucleic acids are then subdivided into compartments, such that
each compartment comprises a first nucleic acid and a second nucleic acid together with
respective fusion proteins encoded by the first and second nucleic acids. The first polypeptide
is then allowed to bind to the second polypeptide, such that binding of the first and second
polypeptides leads to stable interaction of the replicase subdomains to generate replicase
activity. Finally, amplification of at least one of the first and second nucleic acids by the
replicase is detected.

Our invention therefore encompasses an in vitro selection system whereby
reconstitution of replicase function through the cognate association of two polypeptide ligands
drives amplification and linkage of the genes of the two ligands. Such an in vitro two-hybrid
system is particularly suited for the investigation of protein-protein interactions at high
temperatures, e.g. for the investigation of the proteomes of thermophilic organisms or the
engineering of highly stable interactions.

The system can also be applied to the screening and isolation of molecular compounds
that promote cognate interactions. For example, compounds can be chemically linked to either
primers or dNTPs and thus would only be incorporated into amplicons if promoting
association. In order to prevent cross-over, such compounds would have to be released only
after compartmentalisation has taken place, e.g. by coupling to microbeads or by inclusion
into dissolvable microspheres.

Single Step and Multiple Step Selections

The selection of suitable encapsulation conditions is desirable. Depending on the
complexity and size of the library to be screened, it may be beneficial to set up the
encapsulation procedure such that 1 or less than 1 nucleic acid is encapsulated per
microcapsule or compartment. This will provide the greatest power of resolution. Where the
library is larger and/or more complex, however, this may be impracticable; it may be
preferable to encapsulate or compartmentalise several nucleic acids together and rely on
repeated application of the method of the invention to achieve sorting of the desired activity.
A combination of encapsulation procedures may be used to obtain the desired enrichment.
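
The trade-off between loading and resolution can be made concrete with a short calculation.
The sketch below assumes, as a simplification that the text does not state, that nucleic acids
distribute randomly among compartments so that the number per compartment follows a Poisson
distribution with a given mean; the function names are illustrative only.

```python
import math


def occupancy(mean_templates):
    """Poisson probabilities of a compartment holding 0, 1 or more templates."""
    if mean_templates <= 0.0:
        raise ValueError("mean_templates must be positive")
    empty = math.exp(-mean_templates)
    single = mean_templates * empty
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


def clonal_fraction(mean_templates):
    """Fraction of occupied compartments that hold exactly one template."""
    occ = occupancy(mean_templates)
    return occ["single"] / (1.0 - occ["empty"])


for mean in (0.1, 0.3, 1.0, 3.0):
    occ = occupancy(mean)
    print(f"mean {mean:>3}: empty {occ['empty']:.3f}, "
          f"clonal among occupied {clonal_fraction(mean):.3f}")
```

Under this assumption, a low mean loading leaves most compartments empty but makes nearly all
occupied compartments clonal, while a higher loading uses the compartments more fully at the
cost of mixing several genotypes, which is where repeated rounds of selection become useful.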

Theoretical studies indicate that the larger the number of nucleic acid variants created
the more likely it is that a molecule will be created with the properties desired (see Perelson
and Oster, 1979 J Theor Biol, 81, 645-70 for a description of how this applies to repertoires of
antibodies). Recently it has also been confirmed practically that larger phage-antibody
repertoires do indeed give rise to more antibodies with better binding affinities than smaller
repertoires (Griffiths et al., (1994) Embo J, 13, 3245-60). To ensure that rare variants are
generated and thus are capable of being selected, a large library size is desirable. Thus, the use
of optimally small microcapsules is beneficial.
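
The relationship between library size and the chance that a rare variant is present at all
can be illustrated numerically. The following is a minimal sketch, assuming that each library
member is an independent draw in which the variant of interest occurs with a fixed frequency.
This sampling model, the example frequency and the function names are illustrative
assumptions and are not taken from the cited studies.

```python
import math


def prob_variant_present(library_size: int, variant_frequency: float) -> float:
    """Probability that at least one copy of a variant is in the library.

    Assumes independent draws, so P = 1 - (1 - f)^N.
    """
    if not 0.0 < variant_frequency < 1.0:
        raise ValueError("variant_frequency must lie strictly between 0 and 1")
    # expm1/log1p keep precision when f is tiny and N is large.
    return -math.expm1(library_size * math.log1p(-variant_frequency))


def library_size_needed(target_probability: float, variant_frequency: float) -> int:
    """Smallest library size N giving the target probability of presence."""
    if not 0.0 < target_probability < 1.0:
        raise ValueError("target_probability must lie strictly between 0 and 1")
    if not 0.0 < variant_frequency < 1.0:
        raise ValueError("variant_frequency must lie strictly between 0 and 1")
    return math.ceil(
        math.log1p(-target_probability) / math.log1p(-variant_frequency)
    )


if __name__ == "__main__":
    # Hypothetical variant occurring once per million library members.
    for n in (10**4, 10**6, 10**8):
        print(n, prob_variant_present(n, 1e-6))
```

Under this model the probability of presence rises steeply once the library size exceeds
the reciprocal of the variant frequency, which is one way of seeing why large libraries, and
hence small compartments, are advantageous.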

In addition to the nucleic acids described above, the microcapsules or compartments
according to the invention may comprise further components required for the replication
reaction to take place. Other components of the system may for example comprise those
necessary for transcription and/or translation of the nucleic acid. These are selected for the
requirements of a specific system from the following: a suitable buffer, an in vitro
transcription/replication system and/or an in vitro translation system containing all the
necessary ingredients, enzymes and cofactors, RNA polymerase, nucleotides, nucleic acids
(natural or synthetic), transfer RNAs, ribosomes and amino acids, and the substrates of the
reaction of interest in order to allow selection of the modified gene product.

Buffer

A suitable buffer will be one in which all of the desired components of the biological
system are active and will therefore depend upon the requirements of each specific reaction
system. Buffers suitable for biological and/or chemical reactions are known in the art and
recipes provided in various laboratory texts (Sambrook et al., (1989) Molecular cloning: a
laboratory manual. Cold Spring Harbor Laboratory Press, New York).

In vitro Translation

The replicase may be provided by expression from a suitable host as described
elsewhere, or it may be produced by in vitro transcription/translation in a suitable system as
known in the art.

The in vitro translation system will usually comprise a cell extract, typically from
bacteria (Zubay, 1973, Annu Rev Genet, 7, 267-87; Zubay, 1980, Methods Enzymol, 65, 856-77;
Lesley et al., 1991 J Biol Chem 266(4), 2632-8; Lesley, 1995 Methods Mol Biol 37, 265-78),
rabbit reticulocytes (Pelham and Jackson, 1976, Eur J Biochem, 67, 247-56), or wheat
germ (Anderson et al., 1983, Methods Enzymol, 101, 635-44). Many suitable systems are
commercially available (for example from Promega) including some which will allow coupled
transcription/translation (all the bacterial systems and the reticulocyte and wheat germ TNT
extract systems from Promega). The mixture of amino acids used may include synthetic amino
acids if desired, to increase the possible number or variety of proteins produced in the library.
This can be accomplished by charging tRNAs with artificial amino acids and using these
tRNAs for the in vitro translation of the proteins to be selected (Ellman et al., 1991, Methods
Enzymol 202, 301-36; Benner, 1994, Trends Biotechnol 12, 158-63; Mendel et al., 1995,
Annu Rev Biophys Biomol Struct, 24, 435-62). Particularly desirable may be the use of in vitro
translation systems reconstituted from purified components like the PURE system (Shimizu et
al. (2001) Nat. Biotech., 19, 751).

After each round of selection the enrichment of the pool of nucleic acids for those
encoding the molecules of interest can be assayed by non-compartmentalised in vitro
transcription/replication or coupled transcription-translation reactions. The selected pool is
cloned into a suitable plasmid vector and RNA or recombinant protein is produced from the
individual clones for further purification and assay.

The invention moreover relates to a method for producing a gene product, once a
nucleic acid encoding the gene product has been selected by the method of the invention.
Clearly, the nucleic acid itself may be directly expressed by conventional means to produce
the gene product. However, alternative techniques may be employed, as will be apparent to
those skilled in the art. For example, the genetic information incorporated in the gene product
may be incorporated into a suitable expression vector, and expressed therefrom.

COMPARTMENTS

As used here, the term "compartment" is synonymous with "microcapsule" and the
terms are used interchangeably. The function of the compartment is to enable co-localisation
of the nucleic acid and the corresponding polypeptide encoded by the nucleic acid. This is
preferably achieved by the ability of the compartment to substantially restrict diffusion of
template and product strands to other compartments. Any replicase activity of the polypeptide
is therefore restricted to being exercised on a nucleic acid within the confines of a
compartment, and not other nucleic acids in other compartments. Another function of
compartments is to restrict diffusion of molecules generated in a chemical or enzymatic
reaction that feed or unblock a replication reaction.

The compartments of the present invention therefore require appropriate physical
properties to allow the working of the invention.

First, to ensure that the nucleic acids and polypeptides do not diffuse between
compartments, the contents of each compartment must be isolated from the contents of the
surrounding compartments, so that there is no or little exchange of the nucleic acids and
polypeptides between the compartments over a significant timescale.

Second, the method of the present invention requires that there are only a limited
number of nucleic acids per compartment, or that all members within a single compartment
are clonal (i.e. identical). This ensures that the polypeptide encoded by and corresponding to
an individual nucleic acid will be isolated from other different nucleic acids. Thus, coupling
between nucleic acid and its corresponding polypeptide will be highly specific. The
enrichment factor is greatest with on average one or fewer nucleic acid clonal species per
compartment, the linkage between nucleic acid and the activity of the encoded polypeptide
being as tight as is possible, since the polypeptide encoded by an individual nucleic acid will
be isolated from the products of all other nucleic acids. However, even if the theoretically
optimal situation of, on average, a single nucleic acid or less per compartment is not used, a
ratio of 5, 10, 50, 100 or 1000 or more nucleic acids per compartment may prove beneficial in
selecting from a large library. Subsequent rounds of selection, including renewed
compartmentalisation with differing nucleic acid distribution, will permit more stringent
selection of the nucleic acids. Preferably, on average there is a single nucleic acid clonal
species, or fewer, per compartment.
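
The effect of the average number of nucleic acids per compartment on how they are
distributed can be sketched as follows. This is an illustration only, assuming that nucleic
acids partition into compartments independently and at random, so that occupancy follows a
Poisson distribution; that statistical model and the function name are assumptions and are
not stated in the text.

```python
import math


def occupancy(mean_per_compartment: float) -> dict:
    """Poisson occupancy statistics for a given mean number of nucleic
    acids per compartment (assumed random-partitioning model)."""
    lam = mean_per_compartment
    if lam <= 0:
        raise ValueError("mean_per_compartment must be positive")
    p_empty = math.exp(-lam)
    p_single = lam * p_empty
    p_multiple = 1.0 - p_empty - p_single
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": p_multiple,
        # Share of non-empty compartments holding exactly one nucleic acid.
        "single_among_occupied": p_single / (1.0 - p_empty),
    }


if __name__ == "__main__":
    for lam in (0.1, 1.0, 5.0, 10.0):
        stats = occupancy(lam)
        print(lam, {k: round(v, 4) for k, v in stats.items()})
```

Under this assumption, lowering the mean below one leaves more compartments empty but makes
almost every occupied compartment clonal, whereas means of 5 or more make co-encapsulation
the norm, which is why further rounds of selection are then relied upon.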

Moreover, each compartment contains a nucleic acid; this means that whilst some
compartments may remain empty, the conditions are adjusted such that, statistically, each
compartment will contain at least one, and preferably only one, nucleic acid.

Third, the formation and the composition of the compartments must not abolish the
function of the machinery for the expression of the nucleic acids and the activity of the
polypeptides.

Consequently, any compartmentalisation system used must fulfil these three 
requirements. The appropriate system(s) may vary depending on the precise nature of the 
requirements in each application of the invention, as will be apparent to the skilled person.

Various technologies are available for compartmentalisation, for example, gas aphrons
(Jauregi and Varley, 1998, Biotechnol Bioeng 59, 471) and prefabricated nanowells (Huang
and Schreiber, 1997, Proc. Natl Acad Sci USA, 94, 25). For different applications, different
compartment sizes and surface chemistries, as discussed in further detail below, may be
desirable. For example, it may be sufficient to utilise diffusion limiting porous materials like
gels or alginate (Draget et al., 1997, Int J Macromol 21, 47) or zeolite-type materials.
Furthermore, where in-situ PCR or in-cell PCR is carried out, cells may be treated with a
cross-linking fixative to form porous compartments allowing diffusion of dNTPs, enzymes
and primers.

A wide variety of compartmentalisation or microencapsulation procedures are
available (Benita, S., Ed. (1996). Microencapsulation: methods and industrial applications.
Drugs and pharmaceutical sciences. Edited by Swarbrick, J. New York: Marcel Dekker) and
may be used to create the compartments used in accordance with the present invention.
Indeed, more than 200 microencapsulation or compartmentalisation methods have been
identified in the literature (Finch, C. A. (1993) Encapsulation and controlled release. Spec.
Publ.-R. Soc. Chem. 138, 35).

These include membrane enveloped aqueous vesicles such as lipid vesicles (liposomes) (New,
R. R. C., Ed. (1990). Liposomes: a practical approach. The practical approach series. Edited
by Rickwood, D. & Hames, B. D. Oxford: Oxford University Press) and non-ionic surfactant
vesicles (van Hal, D. A., Bouwstra, J. A. & Junginger, H. E. (1996). Nonionic surfactant
vesicles containing estradiol for topical application. In Microencapsulation: methods and
industrial applications (Benita, S., ed.), pp. 329-347. Marcel Dekker, New York). These are
closed-membranous capsules of single or multiple bilayers of non-covalently assembled
molecules, with each bilayer separated from its neighbour by an aqueous compartment. In the
case of liposomes the membrane is composed of lipid molecules; these are usually
phospholipids but sterols such as cholesterol may also be incorporated into the membranes
(New, R. R. C., Ed. (1990). Liposomes: a practical approach. The practical approach series.
Edited by Rickwood, D. & Hames, B. D. Oxford: Oxford University Press). A variety of
enzyme-catalysed biochemical reactions, including RNA and DNA polymerisation, can be
performed within liposomes (Chakrabarti J Mol Evol. (1994), 39, 555-9; Oberholzer Biochem
Biophys Res Commun. (1995), 207, 250-7; Oberholzer Chem Biol. (1995) 2, 677-82; Walde,
Biotechnol Bioeng (1998), 57, 216-219; Wick & Luisi, Chem Biol. (1996), 3, 277-85).

With a membrane-enveloped vesicle system much of the aqueous phase is outside the vesicles
and is therefore non-compartmentalised. This continuous, aqueous phase should be removed
or the biological systems in it inhibited or destroyed (for example, by digestion of nucleic
acids with DNase or RNase) in order that the reactions are limited to the compartmentalised
microcapsules (Luisi et al., Methods Enzymol. 1987, 136, 188-216).

Enzyme-catalysed biochemical reactions have also been demonstrated in microcapsule
compartments generated by a variety of other methods. Many enzymes are active in reverse
micellar solutions (Bru & Walde, Eur J Biochem. 1991, 199, 95-103; Bru & Walde, Biochem
Mol Biol Int. 1993, 31, 685-92; Creagh et al., Enzyme Microb Technol. 1993, 15, 383-92;
Haber et al., 1993; Kumar et al., Biophys J 1989, 55, 789-792; Luisi, P.
L. & B., S.-H. (1987). Activity and conformation of enzymes in reverse micellar solutions.
Methods Enzymol 136(188), 188-216; Mao & Walde, Biochem Biophys Res Commun 1991,
178, 1105-1112; Mao, Q. & Walde, P. (1991). Substrate effects on the enzymatic activity of
alpha-chymotrypsin in reverse micelles. Biochem Biophys Res Commun 178(3), 1105-12; Mao
Eur J Biochem. 1992, 208, 165-70; Perez, G. M., Sanchez, F. A. & Garcia, C. F. (1992).
Application of active-phase plot to the kinetic analysis of lipoxygenase in reverse micelles.
Biochem J.; Walde, P., Goto, A., Monnard, P.-A., Wessicken, M. & Luisi, P. L. (1994).
Oparin's reactions revisited: enzymatic synthesis of poly(adenylic acid) in micelles and
self-reproducing vesicles. J. Am. Chem. Soc. 116, 7541-7547; Walde, P., Han, D. & Luisi, P. L.
(1993). Spectroscopic and kinetic studies of lipases solubilized in reverse micelles.
Biochemistry 32, 4029-34; Walde Eur J Biochem. 1988 173, 401-9) such as the
AOT-isooctane-water system (Menger, F. M. & Yamada, K. (1979). J. Am. Chem. Soc. 101,
6731-6734).


Compartments can also be generated by interfacial polymerisation and interfacial
complexation (Whateley, T. L. (1996) Microcapsules: preparation by interfacial
polymerisation and interfacial complexation and their applications. In Microencapsulation:
methods and industrial applications (Benita, S., ed.), pp. 349-375. Marcel Dekker, New
York). Microcapsule compartments of this sort can have rigid, nonpermeable membranes, or
semipermeable membranes. Semipermeable microcapsules bordered by cellulose nitrate
membranes, polyamide membranes and lipid-polyamide membranes can all support
biochemical reactions, including multienzyme systems (Chang, Methods Enzymol. 1987, 136,
67-82; Chang, Artif Organs. 1992, 16, 71-4; Lim, Appl Biochem Biotechnol. 1984, 10, 81-5).
Alginate/polylysine compartments (Lim & Sun, Science (1980) 210, 908-10), which can be
formed under very mild conditions, have also proven to be very biocompatible, providing, for
example, an effective method of encapsulating living cells and tissues (Chang, Artif Organs.
(1992) 16, 71-4; Sun, ASAIO J. (1992), 38, 125-7).

Non-membranous compartmentalisation systems based on phase partitioning of an
aqueous environment in a colloidal system, such as an emulsion, may also be used.

Preferably, the compartments of the present invention are formed from emulsions;
heterogeneous systems of two immiscible liquid phases with one of the phases dispersed in
the other as droplets of microscopic or colloidal size (Becher, P. (1957) Emulsions: theory and
practice. Reinhold, New York; Sherman, P. (1968) Emulsion science. Academic Press,
London; Lissant, K.J., ed. Emulsions and emulsion technology. Surfactant Science. New
York: Marcel Dekker, 1974; Lissant, K.J., ed. Emulsions and emulsion technology.
Surfactant Science. New York: Marcel Dekker, 1984).

Emulsions may be produced from any suitable combination of immiscible liquids.
Preferably the emulsion of the present invention has water (containing the biochemical
components) as the phase present in the form of finely divided droplets (the disperse, internal
or discontinuous phase) and a hydrophobic, immiscible liquid (an 'oil') as the matrix in which
these droplets are suspended (the nondisperse, continuous or external phase). Such emulsions
are termed 'water-in-oil' (W/O). This has the advantage that the entire aqueous phase
containing the biochemical components is compartmentalised in discrete droplets (the internal
phase). The external phase, being a hydrophobic oil, generally contains none of the
biochemical components and hence is inert.

The emulsion may be stabilised by addition of one or more surface-active agents
(surfactants). These surfactants are termed emulsifying agents and act at the water/oil interface
to prevent (or at least delay) separation of the phases. Many oils and many emulsifiers can be
used for the generation of water-in-oil emulsions; a recent compilation listed over 16,000
surfactants, many of which are used as emulsifying agents (Ash, M. and Ash, I. (1993)
Handbook of industrial surfactants. Gower, Aldershot). Suitable oils include light white
mineral oil; suitable non-ionic surfactants (Schick, 1966) include sorbitan monooleate
(Span™ 80; ICI), polyoxyethylenesorbitan monooleate (Tween™ 80; ICI) and
t-Octylphenoxypolyethoxyethanol (Triton X-100).

The use of anionic surfactants may also be beneficial. Suitable surfactants include
sodium cholate and sodium taurocholate. Particularly preferred is sodium deoxycholate,
preferably at a concentration of 0.5% w/v, or below. Inclusion of such surfactants can in some
cases increase the expression of the nucleic acids and/or the activity of the polypeptides.
Addition of some anionic surfactants to a non-emulsified reaction mixture completely
abolishes translation. During emulsification, however, the surfactant is transferred from the
aqueous phase into the interface and activity is restored. Addition of an anionic surfactant to
the mixtures to be emulsified ensures that reactions proceed only after compartmentalisation.

Creation of an emulsion generally requires the application of mechanical energy to
force the phases together. There are a variety of ways of doing this which utilise a variety of
mechanical devices, including stirrers (such as magnetic stir-bars, propeller and turbine
stirrers, paddle devices and whisks), homogenisers (including rotor-stator homogenisers,
high-pressure valve homogenisers and jet homogenisers), colloid mills, ultrasound and
'membrane emulsification' devices (Becher, P. (1957) Emulsions: theory and practice.
Reinhold, New York; Dickinson, E. (1994) In Wedlock, D. J. (ed.), Emulsions and droplet size
control. Butterworth-Heinemann, Oxford, Vol. pp. 191-257).

Aqueous compartments formed in water-in-oil emulsions are generally stable with
little if any exchange of polypeptides or nucleic acids between compartments. Additionally, it
is known that several biochemical reactions proceed in emulsion compartments. Moreover,
complicated biochemical processes, notably gene transcription and translation, are also active
in emulsion microcapsules. The technology exists to create emulsions with volumes all the
way up to industrial scales of thousands of litres (Becher, P. (1957) Emulsions: theory and
practice. Reinhold, New York; Sherman, P. (1968) Emulsion science. Academic Press,
London; Lissant, K.J., ed. Emulsions and emulsion technology. Surfactant Science. New
York: Marcel Dekker, 1974; Lissant, K.J., ed. Emulsions and emulsion technology.
Surfactant Science. New York: Marcel Dekker, 1984).

The preferred compartment size will vary depending upon the precise requirements of
any individual selection process that is to be performed according to the present invention. In
all cases, there will be an optimal balance between gene library size, the required enrichment
and the required concentration of components in the individual compartments to achieve
efficient expression and reactivity of the polypeptides.

The processes of expression may occur either in situ within each individual
microcapsule or exogenously within cells (e.g. bacteria) or other suitable forms of
subcompartmentalization. Both in vitro transcription and coupled transcription-translation
become less efficient at sub-nanomolar DNA concentrations. Because of the requirement for
only a limited number of DNA molecules to be present in each compartment, this therefore
sets a practical upper limit on the possible compartment size where in vitro transcription is
used. Preferably, for expression in situ using in vitro transcription and/or translation the mean
volume of the compartments is less than 5.2 x 10^-16 m^3 (corresponding to a spherical
compartment of diameter less than 10 µm).
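
The geometry behind this upper limit can be checked with a short calculation. The sketch
below is illustrative only: it assumes perfectly spherical compartments, and the function
names are hypothetical. It converts the stated volume of 5.2 x 10^-16 m^3 into a diameter and
gives the molar concentration that a single DNA molecule would have in a compartment of
that volume.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def sphere_diameter_m(volume_m3: float) -> float:
    """Diameter of a sphere of the given volume, from V = pi * d^3 / 6."""
    return (6.0 * volume_m3 / math.pi) ** (1.0 / 3.0)


def molar_concentration(n_molecules: float, volume_m3: float) -> float:
    """Molar concentration of n molecules confined to the given volume."""
    litres = volume_m3 * 1000.0  # 1 m^3 = 1000 litres
    return n_molecules / (AVOGADRO * litres)


if __name__ == "__main__":
    volume = 5.2e-16  # m^3, the limit quoted in the text
    print(f"diameter: {sphere_diameter_m(volume) * 1e6:.2f} um")
    print(f"one molecule: {molar_concentration(1, volume) * 1e12:.2f} pM")
```

A volume of 5.2 x 10^-16 m^3 corresponds to a sphere about 10 µm across, in which a single
template molecule is present at a concentration of a few picomolar.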

An alternative is the separation of expression and compartmentalisation, e.g. using a
cellular host. For inclusion of cells (in particular eucaryotic cells) mean compartment
diameters of larger than 10 µm may be preferred.

As shown in the Examples, to colocalize the polymerase gene and encoded protein
within the same emulsion compartment, we used bacteria (E. coli) overexpressing Taq
polymerase as "delivery vehicles". E. coli cells (diameter 1-5 µm) fit readily into our emulsion
compartments while leaving room for sufficient amounts of PCR reagents like nucleotide
triphosphates and primers (as shown in Fig. 2). The denaturation step of the first PCR cycle
ruptures the bacterial cell and releases the expressed polymerase and its encoding gene into
the compartment allowing self-replication to proceed while simultaneously destroying
background bacterial enzymatic activities. Furthermore, by analogy to hot-start strategies, this
cellular "subcompartmentalization" prevents release of polymerase activity at ambient
temperatures and the resulting non-specific amplification products.
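
How much room a cell leaves for reagents can be estimated roughly. The sketch below is
an upper-bound illustration only: it treats both the cell and the compartment as spheres (a
simplifying assumption, since E. coli is rod-shaped), uses the 1-5 µm cell size given above
and the average compartment size of 15 µm reported elsewhere in this document, and the
function name is hypothetical.

```python
def cell_volume_fraction(cell_diameter_um: float,
                         compartment_diameter_um: float) -> float:
    """Fraction of a spherical compartment occupied by one spherical cell.

    Volumes scale with the cube of the diameter, so the pi/6 factors cancel.
    """
    if cell_diameter_um <= 0 or compartment_diameter_um <= 0:
        raise ValueError("diameters must be positive")
    if cell_diameter_um > compartment_diameter_um:
        raise ValueError("cell does not fit into the compartment")
    return (cell_diameter_um / compartment_diameter_um) ** 3


if __name__ == "__main__":
    for cell_um in (1.0, 5.0):
        fraction = cell_volume_fraction(cell_um, 15.0)
        print(f"{cell_um} um cell in 15 um compartment: {fraction:.4%}")
```

Even at the upper end of the stated size range, the cell takes up only a few percent of the
compartment volume under these assumptions.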

The effective DNA or RNA concentration in the compartments may be artificially
increased by various methods that will be well-known to those versed in the art. These
include, for example, the addition of volume excluding chemicals such as polyethylene
glycols (PEG) and a variety of gene amplification techniques, including transcription using
RNA polymerases including those from bacteria such as E. coli (Roberts, 1969 Nature 224,
1168-74; Blattner and Dahlberg, 1972 Nat New Biol 237, 227-32; Roberts et al., 1975 J Biol
Chem. 250, 5530-41; Rosenberg et al., 1975 J Biol Chem 250, 4755-4764), eukaryotes e.g.
(Weil et al., 1979 J Biol Chem. 254, 6163-6173; Manley et al., 1983 Methods Enzymol. 101,
568-82) and bacteriophage such as T7, T3 and SP6 (Melton et al., 1984 Nucleic Acids Res. 12,
7035-56); the polymerase chain reaction (PCR) (Saiki et al., 1988 Science 239, 487-91); Qβ
replicase amplification (Miele et al., 1983 J Mol Biol 171, 281-95; Cahill et al., 1991 Clin
Chem, 37, 1482-5; Chetverin and Spirin, 1995 Prog Nucleic Acid Res Mol Biol 51, 225-70;
Katanaev et al., 1995 FEBS Lett., 359, 89-92); the ligase chain reaction (LCR) (Landegren et
al., 1988 Science, 241, 1077-80; Barany, 1991 PCR Methods Appl, 1, 5-16); and
self-sustained sequence replication system (Fahy et al., 1991 PCR Methods Appl 1, 25-33)
and strand displacement amplification (Walker et al., 1992 Nucleic Acids Res. 20, 1691-6).
Gene amplification techniques requiring thermal cycling such as PCR and LCR may also be
used if the emulsions and the in vitro transcription or coupled transcription-translation
systems are thermostable (for example, the coupled transcription-translation systems could be
made from a thermostable organism such as Thermus aquaticus).

Increasing the effective local nucleic acid concentration enables larger compartments
to be used effectively.

The compartment size must be sufficiently large to accommodate all of the required
components of the biochemical reactions that are needed to occur within the compartment.
For example, in vitro, both transcription reactions and coupled transcription-translation
reactions require a total nucleoside triphosphate concentration of about 2 mM.

For example, in order to transcribe a gene to a single short RNA molecule of 500
bases in length, this would require a minimum of 500 molecules of nucleoside triphosphate
per compartment (8.33 x 10^-22 moles). In order to constitute a 2 mM solution, this number of
molecules must be contained within a compartment of volume 4.17 x 10^-19 litres (4.17 x 10^-22
m^3) which if spherical would have a diameter of 93 nm. Hence, the preferred lower limit for
microcapsules is a diameter of approximately 0.1 µm (100 nm).
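
The arithmetic behind this lower limit can be reproduced as follows. This is a minimal
sketch that assumes a spherical compartment; the function and key names are illustrative.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def min_compartment(n_molecules: int, concentration_molar: float) -> dict:
    """Smallest spherical compartment that holds n molecules at a given
    molar concentration."""
    moles = n_molecules / AVOGADRO
    litres = moles / concentration_molar
    volume_m3 = litres / 1000.0  # 1000 litres per m^3
    diameter_m = (6.0 * volume_m3 / math.pi) ** (1.0 / 3.0)
    return {
        "moles": moles,
        "litres": litres,
        "m3": volume_m3,
        "diameter_nm": diameter_m * 1e9,
    }


if __name__ == "__main__":
    # 500 nucleoside triphosphate molecules at 2 mM, as in the text.
    result = min_compartment(500, 2e-3)
    for key, value in result.items():
        print(f"{key}: {value:.3g}")
```

With the current value of the Avogadro constant the results are about 8.3 x 10^-22 moles,
4.15 x 10^-19 litres and a diameter of roughly 93 nm, which agrees with the figures in the
text to within rounding.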

When using expression hosts as delivery vehicles, there are much less strict
requirements on the compartment size. Basically, the compartment has to be of sufficient size
to contain the expression host as well as sufficient amounts of reagents to carry out the
required reactions. Thus, in such cases larger compartment sizes >10 µm are preferred. By an
appropriate choice of vector used for expression in the host, the template concentration within
compartments can be controlled via the vector origin and resulting copy number (e.g. E. coli:
colE (pUC) >100, p15: 30-50, pSC101: 1-4). Likewise the concentration of the gene product
can be controlled by choice of expression promoter and expression protocol
(e.g. full induction of expression versus promoter leakage). Preferably, gene product
concentration is as high as possible.
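
As a rough illustration of how vector copy number translates into template concentration,
the sketch below assumes one lysed cell per spherical compartment and uses the lower bound
of each copy-number range given above. The one-cell-per-compartment assumption, the 15 µm
example diameter and the names used are illustrative choices, not values prescribed for
this calculation.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

# Lower bounds of the copy-number ranges quoted in the text.
MIN_COPIES_PER_CELL = {"colE (pUC)": 100, "p15": 30, "pSC101": 1}


def template_concentration_molar(copies: int,
                                 compartment_diameter_um: float) -> float:
    """Template concentration after release of `copies` plasmids into a
    spherical compartment of the given diameter."""
    volume_m3 = math.pi * (compartment_diameter_um * 1e-6) ** 3 / 6.0
    volume_litres = volume_m3 * 1000.0
    return copies / (AVOGADRO * volume_litres)


if __name__ == "__main__":
    for origin, copies in MIN_COPIES_PER_CELL.items():
        conc_pm = template_concentration_molar(copies, 15.0) * 1e12
        print(f"{origin}: {conc_pm:.2f} pM")
```

Because concentration scales linearly with copy number, moving from a low-copy to a
high-copy origin raises the template concentration by the same factor for any fixed
compartment size.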

Furthermore, the use of feeder compartments allows feeding of substrates from the
outside (see Ghadessy et al. (2001), PNAS, 98, 4552). Feeding emulsion reactions from
the outside may allow compartment dimensions of 0.1 µm for ribozyme selections, as reagents
do not need to be contained in their entirety within the compartment.

The size of emulsion microcapsules or compartments may be varied simply by
tailoring the emulsion conditions used to form the emulsion according to requirements of the
selection system. The larger the compartment size, the larger is the volume that will be
required to encapsulate a given nucleic acid library, since the ultimately limiting factor will be
the size of the compartment and thus the number of microcapsule compartments possible per
unit volume.
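
The trade-off between compartment size and the volume needed for a library can be made
concrete with a short calculation. The sketch below assumes uniformly sized spherical
compartments and counts only the aqueous phase; the function names, the example library
size and the example diameters are illustrative assumptions.

```python
import math


def droplet_volume_litres(diameter_um: float) -> float:
    """Volume in litres of one spherical compartment of the given diameter."""
    volume_m3 = math.pi * (diameter_um * 1e-6) ** 3 / 6.0
    return volume_m3 * 1000.0


def compartments_in(aqueous_volume_ul: float, diameter_um: float) -> float:
    """Number of compartments formed from a given aqueous volume."""
    return aqueous_volume_ul * 1e-6 / droplet_volume_litres(diameter_um)


def aqueous_volume_needed_ul(library_size: float, diameter_um: float,
                             mean_per_compartment: float = 1.0) -> float:
    """Aqueous volume (in microlitres) needed to compartmentalise a library."""
    n_compartments = library_size / mean_per_compartment
    return n_compartments * droplet_volume_litres(diameter_um) * 1e6


if __name__ == "__main__":
    print(f"{compartments_in(200.0, 15.0):.3g} compartments in 200 ul at 15 um")
    for d_um in (2.0, 15.0, 30.0):
        needed = aqueous_volume_needed_ul(1e9, d_um)
        print(f"{d_um} um: {needed:.3g} ul for 1e9 members")
```

Because the volume of a compartment grows with the cube of its diameter, doubling the
diameter increases the aqueous volume needed for the same library eightfold.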

The size of the compartments is selected not only having regard to the requirements of
the replication system, but also those of the selection system employed for the nucleic acid.
Thus, the components of the selection system, such as a chemical modification system, may
require reaction volumes and/or reagent concentrations which are not optimal for replication.
As set forth herein, such requirements may be accommodated by a secondary re-encapsulation
step; moreover, they may be accommodated by selecting the compartment size in order to
maximise replication and selection as a whole. Empirical determination of optimal
compartment volume and reagent concentration, for example, as set forth herein, is preferred.

In a highly preferred embodiment of the present invention, the emulsion is a
water-in-oil emulsion. The water-in-oil emulsion is made by adding an aqueous phase dropwise
to an oil phase in the presence of a surfactant comprising 4.5% (v/v) Span 80, about 0.4% (v/v)
Tween 80 and about 0.05-0.1% (v/v) Triton X-100 in mineral oil, preferably at a ratio of
oil:water phase of 2:1 or 3:1. It appears that the ratio of the three surfactants is important for
the advantageous properties of the emulsion, and accordingly, our invention also encompasses
a water-in-oil emulsion having increased amounts of surfactant but with substantially the
same ratio of Span 80, Tween 80 and Triton X-100. In a preferred embodiment, the surfactant
comprises 4.5% (v/v) Span 80, 0.4% (v/v) Tween 80 and 0.05% (v/v) Triton X-100.

The water-in-oil emulsion is preferably formed under constant stirring in 2 ml round
bottom biofreeze vials with continued stirring at 1000 rpm for a further 4 or 5 minutes after
complete addition of the aqueous phase. The rate of addition may be up to 12 drops/min (ca.
10 µl each). The aqueous phase may include just water, or it may comprise a buffered solution
having additional components such as nucleic acids, nucleotide triphosphates, etc. In a
preferred embodiment, the aqueous phase comprises a PCR reaction mix as disclosed
elsewhere in this document, as well as nucleic acid, and polymerase. The water-in-oil
emulsion may be formed from 200 µl of aqueous phase (for example PCR reaction mix) and
400 µl oil phase as described above.
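
The quantities in this preferred embodiment can be worked out for a given batch as follows.
This is a sketch under stated assumptions: the percentages are taken as v/v of the final oil
phase, the scaling parameter illustrates raising all three surfactants in the same ratio, and
the function names are hypothetical.

```python
import math

# Surfactant percentages (v/v) of the preferred embodiment in the text.
SURFACTANT_PERCENT_VV = {"Span 80": 4.5, "Tween 80": 0.4, "Triton X-100": 0.05}


def oil_phase_recipe(oil_phase_ul: float, surfactant_scale: float = 1.0) -> dict:
    """Component volumes (microlitres) for an oil phase of the given volume.

    surfactant_scale multiplies all three surfactants equally, so their
    ratio to one another is unchanged.
    """
    recipe = {
        name: oil_phase_ul * percent * surfactant_scale / 100.0
        for name, percent in SURFACTANT_PERCENT_VV.items()
    }
    # Mineral oil makes up the remainder of the oil phase.
    recipe["mineral oil"] = oil_phase_ul - sum(recipe.values())
    return recipe


def min_addition_time_s(aqueous_ul: float, drop_ul: float = 10.0,
                        max_drops_per_min: float = 12.0) -> float:
    """Shortest dropwise addition time at the maximum stated drop rate."""
    drops = math.ceil(aqueous_ul / drop_ul)
    return drops * 60.0 / max_drops_per_min


if __name__ == "__main__":
    for name, volume in oil_phase_recipe(400.0).items():
        print(f"{name}: {volume:.1f} ul")
    print(f"addition of 200 ul takes at least {min_addition_time_s(200.0):.0f} s")
```

For 400 µl of oil phase this gives 18 µl Span 80, 1.6 µl Tween 80 and 0.2 µl Triton X-100,
and adding 200 µl of aqueous phase in 10 µl drops takes at least 100 seconds at 12 drops/min.
Volumes this small would in practice be handled by preparing a larger stock of oil phase.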

The water-in-oil emulsion according to the invention has advantageous properties of
increased thermal stability. Thus, no changes in compartment size or evidence of coalescence
is observed after 20 cycles of PCR as judged by laser diffraction and light microscopy. This is
shown in Figure 2. In addition, polymerase chain reaction proceeded efficiently within the
compartments of this water-in-oil composition, to approach the rates observed in solution
PCR. Aqueous compartment dimensions in the water-in-oil emulsion according to our
invention are on average 15 µm in size. Once formed, the compartments of the emulsion
according to our invention do not permit the exchange of macromolecules like DNA and
proteins to any significant degree (as shown in Figure 3A). This is presumably because the
large molecular weight and charged nature of the macromolecules precludes diffusion across
the hydrophobic surfactant shell, even at elevated temperatures.

NUCLEIC ACIDS 

A nucleic acid in accordance with the present invention is as described above.
Preferably, the nucleic acid is a molecule or construct selected from the group consisting of a
DNA molecule, an RNA molecule, a partially or wholly artificial nucleic acid molecule
consisting of exclusively synthetic or a mixture of naturally-occurring and synthetic bases, any
one of the foregoing linked to a polypeptide, and any one of the foregoing linked to any other
molecular group or construct. Advantageously, the other molecular group or construct may be
selected from the group consisting of nucleic acids, polymeric substances, particularly beads,
for example polystyrene beads, magnetic substances such as magnetic beads, labels, such as
fluorophores or isotopic labels, chemical reagents, binding agents such as macrocycles and the
like.

The nucleic acid may comprise suitable regulatory sequences, such as those required
for efficient expression of the gene product, for example promoters, enhancers, translational
initiation sequences, polyadenylation sequences, splice sites and the like.

The terms "isolating", "sorting" and "selecting", as well as variations thereof, are used
herein. Isolation, according to the present invention, refers to the process of separating an
entity from a heterogeneous population, for example a mixture, such that it is free of at least
one substance with which it is associated before the isolation process. In a preferred
embodiment, isolation refers to purification of an entity essentially to homogeneity. Sorting of
an entity refers to the process of preferentially isolating desired entities over undesired
entities. In as far as this relates to isolation of the desired entities, the terms "isolating" and
"sorting" are equivalent. The method of the present invention permits the sorting of desired
nucleic acids from pools (libraries or repertoires) of nucleic acids which contain the desired
nucleic acid. Selecting is used to refer to the process (including the sorting process) of
isolating an entity according to a particular property thereof.

"Oligonucleotide" refers to a molecule comprised of two or more 
deoxyribonucleotides or ribonucleotides, preferably more than three. The exact size of the
oligonucleotide will depend on the ultimate function or use of the oligonucleotide. The
oligonucleotide may be derived synthetically or by cloning.

The nucleic acids selected according to our invention may be further manipulated. For
example, nucleic acid encoding selected replicase or interacting polypeptides are incorporated
into a vector, and introduced into suitable host cells to produce transformed cell lines that
express the gene product. The resulting cell lines can then be propagated for reproducible
qualitative and/or quantitative analysis of the effect(s) of potential drugs affecting gene
product function. Thus gene product expressing cells may be employed for the identification
of compounds, particularly small molecular weight compounds, which modulate the function
of gene product. Thus host cells expressing gene product are useful for drug screening and it
is a further object of the present invention to provide a method for identifying compounds
which modulate the activity of the gene product, said method comprising exposing cells
containing heterologous DNA encoding gene product, wherein said cells produce functional
gene product, to at least one compound or mixture of compounds or signal whose ability to
modulate the activity of said gene product is sought to be determined, and thereafter
monitoring said cells for changes caused by said modulation. Such an assay enables the
identification of modulators, such as agonists, antagonists and allosteric modulators, of the
gene product. As used herein, a compound or signal that modulates the activity of gene
product refers to a compound that alters the activity of gene product in such a way that the
activity of gene product is different in the presence of the compound or signal (as compared to
the absence of said compound or signal).

Cell-based screening assays can be designed by constructing cell lines in which the
expression of a reporter protein, i.e. an easily assayable protein, such as β-galactosidase,
chloramphenicol acetyltransferase (CAT) or luciferase, is dependent on gene product. Such an
assay enables the detection of compounds that directly modulate gene product function, such
as compounds that antagonise gene product, or compounds that inhibit or potentiate other
cellular functions required for the activity of gene product.

The present invention also provides a method to exogenously affect gene product
dependent processes occurring in cells. Recombinant gene product producing host cells, e.g.
mammalian cells, can be contacted with a test compound, and the modulating effect(s) thereof
can then be evaluated by comparing the gene product-mediated response in the presence and
absence of test compound, or relating the gene product-mediated response of test cells, or
control cells (i.e., cells that do not express gene product), to the presence of the compound.

NUCLEIC ACID LIBRARIES

The method of the present invention is useful for sorting libraries of nucleic acids.
Herein, the terms "library", "repertoire" and "pool" are used according to their ordinary
signification in the art, such that a library of nucleic acids encodes a repertoire of gene
products. In general, libraries are constructed from pools of nucleic acids and have properties
which facilitate sorting. Initial selection of a nucleic acid from a library of nucleic acids using
the present invention will in most cases require the screening of a large number of variant
nucleic acids. Libraries of nucleic acids can be created in a variety of different ways, including
the following.

Pools of naturally occurring nucleic acids can be cloned from genomic DNA or cDNA
(Sambrook et al., 1989, Molecular cloning: a laboratory manual. Cold Spring Harbor
Laboratory Press, New York); for example, phage antibody libraries, made by PCR
amplification of repertoires of antibody genes from immunised or unimmunised donors, have
proved very effective sources of functional antibody fragments (Winter et al., 1994, Annu Rev
Immunol, 12, 433-55; Hoogenboom, H. R. (1997) Designing and optimizing library selection
strategies for generating high-affinity antibodies. Trends Biotechnol. 15, 62-70). Libraries of
genes can also be made by encoding all (see for example Smith, G.P. (1985) Science, 228,
1315-7; Parmley, S.F. and Smith, G.P. (1988) Gene, 73, 305-18) or part of genes (see for
example Lowman et al., (1991) Biochemistry, 30, 10832-8) or pools of genes (see for example
Nissim, A., Hoogenboom et al., (1994) Embo J, 13, 692-8) by a randomised or doped
synthetic oligonucleotide. Libraries can also be made by introducing mutations into a nucleic
acid or pool of nucleic acids 'randomly' by a variety of techniques in vivo, including: using
'mutator strains' of bacteria such as E. coli mutD5 (Liao et al., (1986) Proc Natl Acad Sci U
S A, 83, 576-80; Yamagishi et al., (1990) Protein Eng, 3, 713-9; Low et al., (1996), J Mol
Biol, 260, 359-68); using the antibody hypermutation system of B-lymphocytes (Yelamos et
al., (1995), Nature, 376, 225-9). Random mutations can also be introduced both in vivo and in
vitro by chemical mutagens, and ionising or UV irradiation (see Friedberg et al., 1995, DNA
repair and mutagenesis. ASM Press, Washington D.C.), or incorporation of mutagenic base
analogues (Freese, 1959, J. Mol. Biol., 1, 87; Zaccolo et al., (1996), J Mol Biol, 255, 589-
603). 'Random' mutations can also be introduced into genes in vitro during polymerisation, for
example by using error-prone polymerases (Leung et al., (1989), Technique, 1, 11-15).

Further diversification can be introduced by using homologous recombination either in
vivo (Kowalczykowski et al., (1994) Microbiol Rev, 58, 401-65) or in vitro (Stemmer, (1994),
Nature, 370, 389-91; Stemmer, (1994) Proc Natl Acad Sci U S A 91, 10747-51).

AGENT 

As used herein, the term "agent" includes but is not limited to an atom or molecule,
wherein a molecule may be inorganic or organic, a biological effector molecule and/or a
nucleic acid encoding an agent such as a biological effector molecule, a protein, a polypeptide,
a peptide, a nucleic acid, a peptide nucleic acid (PNA), a virus, a virus-like particle, a
nucleotide, a ribonucleotide, a synthetic analogue of a nucleotide, a synthetic analogue of a
ribonucleotide, a modified nucleotide, a modified ribonucleotide, an amino acid, an amino
acid analogue, a modified amino acid, a modified amino acid analogue, a steroid, a
proteoglycan, a lipid, a fatty acid and a carbohydrate. An agent may be in solution or in
suspension (e.g., in crystalline, colloidal or other particulate form). The agent may be in the
form of a monomer, dimer, oligomer, etc, or otherwise in a complex.

POLYPEPTIDE

As used herein, the terms "peptide", "polypeptide" and "protein" refer to a polymer in
which the monomers are amino acids and are joined together through peptide or disulfide
bonds. "Polypeptide" refers to either a full-length naturally-occurring amino acid chain or a
"fragment thereof or peptide", such as a selected region of the polypeptide that binds to
another protein, peptide or polypeptide in a manner modulatable by a ligand, or to an amino
acid polymer, or a fragment or peptide thereof, which is partially or wholly non-natural.
"Fragment thereof" thus refers to an amino acid sequence that is a portion of a full-length
polypeptide, between about 8 and about 500 amino acids in length, preferably about 8 to about
300, more preferably about 8 to about 200 amino acids, and even more preferably about 10 to
about 50 or 100 amino acids in length. "Peptide" refers to a short amino acid sequence that is
10-40 amino acids long, preferably 10-35 amino acids. Additionally, unnatural amino acids,
for example, β-alanine, phenyl glycine and homoarginine may be included. Commonly
encountered amino acids, which are not gene-encoded, may also be used in the present
invention. All of the amino acids used in the present invention may be either the D- or L-
optical isomer. The L-isomers are preferred. In addition, other peptidomimetics are also
useful, e.g. in linker sequences of polypeptides of the present invention (see Spatola, (1983),
in Chemistry and Biochemistry of Amino Acids, Peptides and Proteins, Weinstein, ed., Marcel
Dekker, New York, p. 267). A "polypeptide binding molecule" is a molecule, preferably a
polypeptide, protein or peptide, which has the ability to bind to another polypeptide, protein or
peptide. Preferably, this binding ability is modulatable by a ligand.

The term "synthetic", as used herein, means that the process or substance described
does not ordinarily occur in nature. Preferably, a synthetic substance is defined as a substance
which is produced by in vitro synthesis or manipulation.

The term 'molecule' is used herein to refer to any atom, ion, molecule, macromolecule
(for example polypeptide), or combination of such entities. The term 'ligand' may be used
interchangeably with the term 'molecule'. Molecules according to the invention may be free
in solution, or may be partially or fully immobilised. They may be present as discrete entities,
or may be complexed with other molecules. Preferably, molecules according to the invention
include polypeptides displayed on the surface of bacteriophage particles. More preferably,
molecules according to the invention include libraries of polypeptides presented as integral
parts of the envelope proteins on the outer surface of bacteriophage particles. Methods for the
production of libraries encoding randomised polypeptides are known in the art and may be
applied in the present invention. Randomisation may be total, or partial; in the case of partial
randomisation, the selected codons preferably encode options for amino acids, and not for
stop codons.

Examples 

Example 1. Construction of Taq polymerase expression plasmids

The Taq polymerase open reading frame is amplified by PCR from Thermus aquaticus
genomic DNA using primers 1 & 2, cut with XbaI & SalI and ligated into pASK75 (Skerra A.,
1994, Gene 151, 131) cut with XbaI & SalI. pASK75 is an expression vector which directs
the synthesis of foreign proteins in E. coli under transcriptional control of the tetA
promoter/operator.

Clones are screened for inserts using primers 3, 4 and assayed for expression of active Taq
polymerase (Taq pol) (see below). The inactive Taq pol mutant D785H/E786V is constructed
using QuikChange mutagenesis (Stratagene). The mutated residues are critical for activity
(Doublié S. et al., 1998, Nature 391, 251; Kiefer J.R. et al., 1998, Nature 391, 304). Resulting
clones are screened for mutation using PCR screening with primers 3, 5 and diagnostic
digestion of the products with PmlI. Mutant clones are assayed for expression of active Taq
pol (see below).

Example 2. Protein Expression and Activity Assay

Transformed TG1 cells are grown in 2xTY 0.1 mg/ml ampicillin. For expression,
overnight cultures are diluted 1/100 into fresh 2xTY medium and grown to OD600 = 0.5 at 37
°C. Protein expression is induced by addition of anhydrotetracycline to a final concentration
of 0.2 µg/ml. After 4 hours further incubation at 37 °C, cells are spun down, washed once, and
re-suspended in an equal volume of 1X SuperTaq polymerase buffer (50 mM KCl, 10 mM
Tris-HCl (pH 9.0), 0.1% Triton X-100, 1.5 mM MgCl2) (HT Biotechnology Ltd, Cambridge
UK).

Washed cells are added directly to a PCR reaction mix (2 µl per 30 µl reaction volume)
comprising template plasmid (20 ng), primers 4 and 5 (1 µM each), dNTPs (0.25 mM), 1X
SuperTaq polymerase buffer, and overlaid with mineral oil. Reactions are incubated for 10
min at 94 °C to release Taq pol from the cells and then thermocycled with 30 cycles of the
profile 94 °C (1 min), 55 °C (1 min), 72 °C (2 min).

Example 3. Emulsification of Amplification Reactions

Emulsification of reactions is carried out as follows. 200 µl of PCR reaction mix (Taq
expression plasmid (200 ng), primers 3 and 4 (1 µM each), dNTPs (0.25 mM), Taq polymerase
(10 units)) is added dropwise (12 drops/min) to the oil phase (mineral oil (Sigma)) in the
presence of 4.5% (v/v) Span 80 (Fluka), 0.4% (v/v) Tween 80 (Sigma) and 0.05% (v/v) Triton
X100 (Sigma) under constant stirring (1000 rpm) in 2 ml round bottom Biofreeze vials (Costar,
Cambridge MA). After complete addition of the aqueous phase, stirring is continued for a
further 4 minutes. Emulsified mixtures are then transferred to 0.5 ml thin-walled PCR tubes
(100 µl/tube) and PCR carried out using 25 cycles of the profile 94 °C (1 min), 60 °C (1 min),
72 °C (3 min) after an initial 5 min incubation at 94 °C. Reaction mixtures are recovered by
the addition of a double volume of ether, vortexing and centrifugation for 2 minutes prior to
removal of the ether phase. Amplified product is visualised by gel electrophoresis on
agarose gels using standard methods (see for example J. Sambrook, E. F. Fritsch, and T.
Maniatis, 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Books 1-3, Cold
Spring Harbor Laboratory Press).

For emulsification of whole cells expressing Taq polymerase, the protocol is modified in the
following way: Taq expression plasmid and Taq polymerase in the reaction cocktail are
omitted and instead 5 x 10^8 induced E. coli TG1 cells (harbouring the expressed Taq
polymerase as well as the expression plasmid) are added together with the additive
tetramethyl ammonium chloride (50 µM), and RNAse (0.05% w/v, Roche, UK). The number
of PCR cycles is also reduced to 20.

Example 4. Self-Replication of the Full-Length wt Taq gene

In order to test genotype-phenotype linkage during self-replication, we mixed cells
expressing either wild type Taq polymerase (wt Taq) or the poorly active (under the buffer
conditions) Stoffel fragment (sf Taq) (F. C. Lawyer, et al., PCR Methods Appl 2, 275-87
(1993)) at a 1:1 ratio and subjected them to CSR either in solution or in emulsion. In solution
the smaller sf Taq is amplified preferentially. However, in emulsion there is almost exclusive
self-replication of the full-length wt Taq gene (Figure 3B). The number of bacterial cells is
adjusted such that the majority of emulsion compartments contain only a single cell. However,
because cells are distributed randomly among compartments, it is unavoidable that a minor
fraction will contain two or more cells. As compartments do not appear to exchange template
DNA (Figure 3A), the small amount of sf Taq amplification in emulsion is likely to originate
from these compartments. Clearly, their abundance is low and, as such, unlikely to affect
selections. Indeed, in a test selection, a single round of CSR is sufficient to isolate wt Taq
clones from a 10^-fold excess of an inactive Taq mutant.
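
By way of illustration only, the size of the minor fraction of compartments holding two or more cells can be estimated if the random distribution of cells is assumed to follow Poisson statistics. The Poisson model, the function names and the example mean of 0.1 cells per compartment in the following Python sketch are our assumptions and are not taken from the Examples.

```python
import math


def occupancy_fractions(mean_cells_per_compartment):
    """Return (empty, single, multiple) compartment fractions.

    Assumes cells are distributed independently among equally sized
    compartments, so that the cell count per compartment is Poisson.
    """
    lam = mean_cells_per_compartment
    p_empty = math.exp(-lam)          # P(0 cells)
    p_single = lam * math.exp(-lam)   # P(exactly 1 cell)
    p_multiple = 1.0 - p_empty - p_single  # P(2 or more cells)
    return p_empty, p_single, p_multiple


def multiple_fraction_of_occupied(mean_cells_per_compartment):
    """Fraction of cell-containing compartments that hold two or more cells."""
    _, p_single, p_multiple = occupancy_fractions(mean_cells_per_compartment)
    return p_multiple / (p_single + p_multiple)


if __name__ == "__main__":
    # Hypothetical loading: on average one cell per ten compartments.
    empty, single, multiple = occupancy_fractions(0.1)
    print(f"empty={empty:.4f} single={single:.4f} multiple={multiple:.4f}")
    print(f"multi-cell share of occupied: {multiple_fraction_of_occupied(0.1):.4f}")
```

Under this model, lowering the mean number of cells per compartment reduces the share of occupied compartments holding more than one cell, at the cost of more empty compartments.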

Using error-prone PCR, we prepared two repertoires of random Taq mutants (L1 (J. P.
Vartanian, M. Henry, S. Wain-Hobson, Nucleic Acids Res, 24, 2627-2631 (1996)) and L2 (M.
Zaccolo, E. Gherardi, J Mol Biol 285, 775-83 (1999))). Only 1-5% of L1 or L2 clones are
active, as judged by PCR, but a single round of CSR selection for polymerase activity under
standard PCR conditions increased the proportion of active clones to 81% (L1*) and 77%
(L2*).

Example 5. Mutagenic PCR

Taq polymerase gene variants are constructed using two different methods of error-
prone PCR.

The first utilises the nucleoside analogues dPTP and dLTP (Zaccolo et al., (1996) J
Mol Biol 255, 589-603). Briefly, a 3-cycle PCR reaction comprising 50 mM KCl, 10 mM Tris-
HCl (pH 9.0), 0.1% Triton X-100, 2 mM MgCl2, dNTPs (500 µM), dPTP (500 µM), dLTP
(500 µM), 1 pM template DNA, primers 8 and 9 (1 µM each), Taq polymerase (2.5 units) in a
total volume of 50 µl is carried out with the thermal profile 94 °C (1 min), 55 °C (1 min), 72
°C (5 min). A 2 µl aliquot is then transferred to a 100 µl standard PCR reaction comprising
50 mM KCl, 10 mM Tris-HCl (pH 9.0), 0.1% Triton X-100, 1.5 mM MgCl2, dNTPs (250 µM),
primers 6 and 7 (1 µM each), Taq polymerase (2.5 units). This reaction is cycled 30 x with the
profile 94 °C (30 seconds), 55 °C (30 seconds), 72 °C (4 minutes). Amplified product is gel-
purified, and cloned into pASK75 as above to create library L2.

The second method utilises a combination of biased dNTPs and MnCl2 to introduce
errors during PCR. The reaction mix comprises 50 mM KCl, 10 mM Tris-HCl (pH 9.0), 0.1%
Triton X-100, 2.5 mM MgCl2, 0.3 mM MnCl2, 1 pM template DNA, dTTP, dCTP, dGTP (all
1 mM), dATP (100 µM), primers 8 and 9 (1 µM each) and Taq polymerase (2.5 units). This
reaction is cycled 30 x with the profile 94 °C (30 seconds), 55 °C (30 seconds), 72 °C (4
minutes), and amplified products cloned as above to create library L1.

Example 6. Selection Protocol

For selection of active polymerases, PCR reactions within emulsions are carried out as
described above but using primers 8, 9. For selection of variants with increased
thermostability, emulsions are preincubated at 99 °C for up to 7 minutes prior to cycling as
above. For selection of variants with increased activity in the presence of the inhibitor
heparin, the latter is added to concentrations of 0.08 and 0.16 units/µl and cycling carried out
as above. Detailed protocols are set out in further Examples below.

Amplification products resulting from compartments containing an active polymerase
are extracted from emulsion with ether as before and then purified by standard phenol-
chloroform extraction. 0.5 volumes of PEG/MgCl2 solution (30% v/v PEG 800, 30 mM
MgCl2) is next added, and after mixing centrifugation carried out at 13,000 RPM for 10
minutes at room temperature. The supernatant (containing unincorporated primers and
dNTPs) is discarded and the pellet re-suspended in TE. Amplified products are then further
purified on spin-columns (Qiagen) to ensure complete removal of primers. These products are
then re-amplified using primers 6, 7 (which are externally nested to primers 8 and 9) in a
standard PCR reaction, with the exception that only 20 cycles are used. Re-amplified products
are gel-purified and re-cloned into pASK75 as above. Transformants are plated and colonies
screened as below. The remainder are scraped into 2xTY/0.1 mg/ml ampicillin, diluted down
to OD600 = 0.1, and grown/induced as above for repetition of the selection protocol.

Example 7. Colony Screening Protocol

Colonies are picked into a 96 well culture dish (Costar), grown and induced for
expression as above. For screening, 2 µl of cells are used in a 30 µl PCR reaction to test for
activity as above in a 96 well PCR plate (Costar) using primers 4 and 5. A temperature
gradient block is used for the screening of selectants with increased thermostability. Reactions
are preincubated for 5 minutes at temperatures ranging from 94.5 to 99 °C prior to standard
cycling as above with primers 4 and 5 or 3 and 4. For screening of heparin-compatible
polymerases, heparin is added to 0.1 units/30 µl during the 96-well format colony PCR screen.
Active polymerases are then assayed in a range of heparin concentrations ranging from 0.007
to 3.75 units/30 µl and compared to wildtype.
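
The heparin range quoted above is consistent with a ten-point two-fold dilution series starting at 3.75 units/30 µl, although the spacing of the series is not stated in the Example. The following Python sketch is therefore illustrative only: the two-fold spacing, the number of points and the function names are our assumptions. It generates such a series and expresses the resistance of a variant as the ratio of the highest heparin concentration it tolerates to that tolerated by wild type.

```python
def twofold_series(top_concentration, n_points):
    """Serial two-fold dilutions, highest concentration first."""
    return [top_concentration / (2 ** i) for i in range(n_points)]


def fold_resistance(max_tolerated_variant, max_tolerated_wildtype):
    """Ratio of the highest tolerated inhibitor concentrations."""
    return max_tolerated_variant / max_tolerated_wildtype


if __name__ == "__main__":
    # Assumed series: 3.75 units/30 µl diluted two-fold nine times.
    series = twofold_series(3.75, 10)
    for concentration in series:
        print(f"{concentration:.4f} units/30 µl")
    # Hypothetical read-out: variant active up to point 5, wild type up to point 8.
    print(fold_resistance(series[5], series[8]))
```

Because every step of such a series is a factor of two, the fold-resistance obtained in this way is always a power of two, which should be borne in mind when comparing clones.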

Example 8. Assay for Catalytic Activity of Polymerases

Kcat and Km (dTTP) are determined using a homopolymeric substrate (Polesky et al.,
(1990) J. Biol. Chem. 265:14579-91). The final reaction mix (25 µl) comprises 1X SuperTaq
buffer (HT Biotech), poly(dA).oligo(dT) (500 nM, Pharmacia), and variable concentrations of
[α-32P]dTTP (approx. 0.01 Ci/mmole). The reaction is initiated by addition of 5 µl enzyme in
1X SuperTaq buffer to give final enzyme concentrations between 1-5 nM. Reactions are
incubated for 4 minutes at 72 °C, quenched with EDTA as in example 14, and applied to
24 mm DE-81 filters. Filters are washed and activity measured as in example 14. Kinetic
parameters are determined using the standard Lineweaver-Burk plot. Experiments using 50%
reduced homopolymeric substrate show no gross difference in incorporation of dTTP by
polymerase, indicating it is present in sufficient excess to validate the kinetic analysis protocol
used.
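
By way of illustration only, a Lineweaver-Burk analysis fits a straight line to 1/v against 1/[S], where the intercept equals 1/Vmax and the slope equals Km/Vmax, and Kcat is then Vmax divided by the enzyme concentration. The following Python sketch shows the arithmetic; the function name, the units and the synthetic data are our assumptions and do not represent measured values.

```python
def lineweaver_burk_fit(substrate_uM, rate_uM_per_s, enzyme_uM):
    """Estimate (kcat in 1/s, Km in µM) from a double-reciprocal plot.

    Fits 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax by ordinary least squares.
    """
    inv_s = [1.0 / s for s in substrate_uM]
    inv_v = [1.0 / v for v in rate_uM_per_s]
    n = len(inv_s)
    mean_x = sum(inv_s) / n
    mean_y = sum(inv_v) / n
    sxx = sum((x - mean_x) ** 2 for x in inv_s)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inv_s, inv_v))
    slope = sxy / sxx                    # Km / Vmax
    intercept = mean_y - slope * mean_x  # 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    kcat = vmax / enzyme_uM
    return kcat, km


if __name__ == "__main__":
    # Synthetic, noise-free Michaelis-Menten data (hypothetical parameters).
    true_kcat, true_km, enzyme = 9.0, 45.0, 0.002  # 1/s, µM, µM (2 nM)
    substrate = [10.0, 20.0, 40.0, 80.0, 160.0]
    rates = [true_kcat * enzyme * s / (true_km + s) for s in substrate]
    kcat, km = lineweaver_burk_fit(substrate, rates, enzyme)
    print(f"kcat = {kcat:.3f} 1/s, Km = {km:.3f} µM")
```

The double-reciprocal transform weights low-substrate points heavily, so with noisy data a direct non-linear fit of the Michaelis-Menten equation may be preferred.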

Example 9. Standard PCR in Aqueous Compartments Within an Emulsion

To establish whether conditions in the aqueous compartments present in an emulsion
are permissive for catalysis, a standard reaction mix is emulsified and PCR carried out. This
leads to amplification of the correct sized Taq polymerase gene present in the plasmid
template, with yields sufficient to allow visualisation using standard agarose gel
electrophoresis.

Example 10. Emulsification of E. coli expressing Taq Polymerase and Subsequent PCR to
Amplify Polymerase Gene

E. coli cells expressing Taq polymerase are emulsified and PCR carried out using
primers flanking the polymerase cassette in the expression vector. Emulsification of up to 5 x
10^ cells (per 600 µl total volume) leads to discernible product formation as judged by agarose
gel electrophoresis. The cells therefore segregate into the aqueous compartments where
conditions are suitable for self-amplification of the polymerase gene by the expressed Taq
polymerase. Similar emulsions are estimated to contain about 1 x 10^10 compartments per ml
(Tawfik D. & Griffiths A.D. (1998) Nature Biotech. 16, 652). The large number of cells that
can be emulsified allows for selection from diverse repertoires of randomised proteins.

Example 11. Maintenance of Genotype-Phenotype Linkage in Emulsion

To be viable for a selection method, the majority of aqueous compartments in the
emulsion should harbour a single cell, and the integrity of compartments should be maintained
during thermal cycling. This is tested by including in the emulsion cells harbouring a
competitor template distinguishable by its smaller size.

E. coli expressing Taq polymerase are co-emulsified with E. coli expressing the
Stoffel fragment at a ratio of one to one. The Stoffel fragment is poorly active under the
conditions used in emulsion, and thus amplification of its expression cassette by the same
primer pair used for Taq self-amplification is the result of co-compartmentalisation with a cell
expressing active Taq polymerase or leakage of Taq polymerase between compartments. After
PCR, the vast majority of products are found to correspond to the full-length Taq polymerase
gene, thus validating the premise of one cell per durable compartment (see Fig. 2, Ghadessy
et al. (2001), PNAS, 98, 4552).

Example 12. Test Selection of Active over Inactive Taq polymerase

To demonstrate that the method can select for potentially rare variants, a 10^ fold
excess of cells expressing inactive polymerase over those expressing the active form are co-
emulsified. After PCR and cloning of amplified product, a single expression screen using a 96
well format indicated a 10^ fold enrichment for the active polymerase.

Example 13. Directed Evolution of Taq Polymerase Variants with Increased Thermal Stability

Polymerases with increased thermostability are of potential practical importance,
reducing activity loss during thermocycling and allowing higher denaturation temperatures for
the amplification of GC rich templates. Thus, we first used the selection method of our
invention for the directed evolution of Taq variants with increased thermostability, starting
from preselected libraries (L1*, L2*) and progressively increasing the temperature and
duration of the initial thermal denaturation. After 3 rounds of selection, we isolated T8 (Table
1), a Taq clone with an 11-fold longer half-life at 97.5 °C than the already thermostable wt Taq
enzyme (Table 2), making T8 the most thermostable member of the Pol I family on record
(Clones are screened and ranked by a PCR assay. Briefly, 2 µl of induced cells are added to
30 µl PCR mix and amplification of a 0.4 kb fragment is assayed under selection conditions
(e.g. increasing amounts of heparin). Thermostability and heparin resistance of purified His
tagged wt and mutant Taq clones is determined as in (Lawyer et al., PCR Methods Appl 2,
275-287 (1993); Lawyer et al., J Biol Chem 264, 6427-37 (1989)) using activated salmon sperm
DNA and normalized enzyme concentrations). Mutations conferring thermostability to T8
(and to a majority of less thermostable mutants) cluster in the 5'-3' exonuclease domain
(Table 1). Indeed, truncation variants of Taq polymerase (F. C. Lawyer, et al., (1993) PCR
Methods Appl 2, 275-87; W. M. Barnes, (1992) Gene 112, 29-35) lacking the exonuclease
domain show improved thermostability, suggesting it may be less thermostable than the main
polymerase domain. The lower thermostability of the exonuclease domain may have
functional significance (for example reflecting a need for greater flexibility), as the stabilizing
mutations in T8 appear to reduce exonuclease activity (approx. 5-fold) (5'-3' exonuclease
activity is determined essentially as in (Y. Xu, et al., J Mol Biol 268, 284-302 (1997)) but in
1x Taq buffer with 0.25 mM dNTPs and the 22-mer oligonucleotide of (Y. Xu, et al., J Mol
Biol 268, 284-302 (1997)) 5' labelled with Cy5 (Amersham). Steady-state kinetics are
measured as in (A. H. Polesky, T. A. Steitz, N. D. Grindley, C. M. Joyce, J Biol Chem 265,
14579-91 (1990)) using the homopolymeric substrate poly(dA)200 (Pharmacia) and oligo(dT)40
primer at 50 °C) (at least at low temperature).



Round  Taq variant                                                  Thermo-     Heparin
                                                                    stability*  resistance**

       Taq wt                                                       1           1
1      T646 (G46V, A109P, F285L)                                    2x          n.d.
       T788 (F73S, R205K, K219E, M236T, A608V)                      4x          n.d.
2      T9 (F278L, P298S)                                            4x          n.d.
       T13 (R205K, K219E, M236T, A608V)                             7x          n.d.
3      T8 (F73S, R205K, K219E, M236T, E434D, A608V)                 11x         <0.5x
1      H32 (E9K, P93S, K340E, Q534R, T539A, V703A, R778K)           n.d.        8x
2      H94 (K225E, L294P, A454S, L461R, D578G, N583S)               n.d.        32x
3      H15 (K225E, E388V, K540R, D578G, N583S, M747R)               0.3x        130x

* as judged by PCR (relative to Taq wt), at 97.5 °C
** as judged by PCR (relative to Taq wt)

Table 1: Properties of selected clones. Clones in bold are related through underlined
mutations. Clones are ranked in relation to wt Taq.
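
By way of illustration only, the relatedness of clones through shared mutations can be checked mechanically. The following Python sketch parses substitution labels of the form "F73S" and lists the mutations common to two clones. The helper names are hypothetical, and the mutation lists are our reading of Table 1 (some residues, such as D578G, are assumed from an imperfectly legible source).

```python
import re

# One-letter wild-type residue, position, one-letter replacement, e.g. "F73S".
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# Mutation lists as read from Table 1 (assumed readings, see lead-in).
CLONES = {
    "T788": ["F73S", "R205K", "K219E", "M236T", "A608V"],
    "T13": ["R205K", "K219E", "M236T", "A608V"],
    "T8": ["F73S", "R205K", "K219E", "M236T", "E434D", "A608V"],
    "H94": ["K225E", "L294P", "A454S", "L461R", "D578G", "N583S"],
    "H15": ["K225E", "E388V", "K540R", "D578G", "N583S", "M747R"],
}


def parse_mutation(label):
    """Split a label such as 'F73S' into ('F', 73, 'S')."""
    match = MUTATION_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def shared_mutations(clone_a, clone_b, clones=CLONES):
    """Mutations present in both clones, ordered by residue position."""
    common = set(clones[clone_a]) & set(clones[clone_b])
    return sorted(common, key=lambda label: parse_mutation(label)[1])


if __name__ == "__main__":
    print(shared_mutations("T13", "T8"))
    print(shared_mutations("H94", "H15"))
```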

Two libraries of Taq polymerase variants generated using error-prone PCR are
expressed in E. coli (library L1, 8x10^ clones, library L2 2x10^ clones; see example 5) and
emulsified as before. The first round of PCR is carried out to enrich for active variants using
the standard Taq polymerase thermocycling profile outlined above. Enriched amplification
products are purified, and recloned to generate libraries comprising active variants (L1*,
L2*; approx 10^ clones for each library). A screen of the L1* and L2* libraries respectively
showed 81% and 77% of randomly picked clones to be active.

Selective pressure is applied to the L1* and L2* libraries during the next round of
PCR by pre-incubating emulsions at 99 °C for 6 or 7 minutes prior to the normal PCR cycle.
Under these conditions, the wild type Taq polymerase loses all activity. Amplified products
are enriched and cloned as above and a 96-well expression screen used to select for active
variants under normal PCR conditions. This yielded 7 clones from the L2* library and 10
clones from the L1* library. These are then screened for increased thermostability using a
temperature gradient PCR block, with a 5 minute pre-incubation at temperatures of 94.5 to
99 °C prior to standard cycling. As judged by gel electrophoresis, 5 clones from each library
are present with increased thermostability compared to wild type. These mutants are able to
efficiently amplify the 320 b.p. target after pre-incubation at 99 °C for 5 minutes. The wild
type enzyme has no discernible activity after pre-incubation at temperatures above 97 °C for 5
minutes or longer.



WO 02/22869    PCT/GB01/04108

Example 14. Assay for Thermal Stability of Polymerase

Thermal inactivation assays of WT and purified His-tagged polymerases are carried
out in a standard 50 µl PCR mixture comprising 1X SuperTaq buffer (HT Biotech), 0.5 ng
plasmid DNA template, 200 µM each of dATP, dTTP, and dGTP, primers 3 and 4 (10 µM),
and polymerase (approximately 5 nM). Reaction mixtures are incubated
at 97.5 °C, with 5 µl aliquots being removed and stored on ice after defined intervals. These
aliquots are assayed in a 50 µl activity reaction buffer comprising 25 mM N-
tris[hydroxymethyl]-3-amino-propanesulfonic acid (TAPS) (pH 9.5), 1 mM β-mercaptoethanol,
2 mM MgCl2, 200 µM each dATP, dTTP, and dGTP, 100 µM [α-32P]dCTP (0.05 Ci/mmole),
and 250 µg/ml activated salmon sperm DNA template. Reactions are incubated for 10 minutes
at 72 °C, stopped by addition of EDTA (25 mM final). Reaction volumes are made up to 500 µl
with solution S (2 mM EDTA, 50 µg/ml sheared salmon sperm DNA) and 500 µl 20% TCA
(v/v) / 2% sodium pyrophosphate (v/v) added. After 20 minutes incubation on ice, reactions
are applied to 24 mm GF/C filters (Whatman). Unincorporated nucleotides are removed by 3
washes with 5% TCA (v/v), 2% sodium pyrophosphate (v/v) followed by two washes with
96% ethanol (v/v). Dried filters are counted in scintillation vials containing Ecoscint A
(National Diagnostics). The assay is calibrated using a known amount of the labeled dCTP
solution (omitting the washes).
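
By way of illustration only, a half-life can be derived from residual activities measured at defined intervals if thermal inactivation is assumed to follow first-order kinetics, so that the logarithm of the activity falls linearly with time. The following Python sketch is not part of the assay described above: the first-order assumption, the function name and the synthetic time course are ours.

```python
import math


def half_life(times_min, residual_activity):
    """Half-life (min) from a first-order decay fitted to ln(activity) vs time.

    Uses ordinary least squares; activities may be in any consistent unit
    (for example counts per minute) but must all be positive.
    """
    xs = list(times_min)
    ys = [math.log(a) for a in residual_activity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    rate_constant = -sxy / sxx  # decay constant k in 1/min
    return math.log(2) / rate_constant


if __name__ == "__main__":
    # Synthetic time course for an enzyme with a 1.5 minute half-life.
    times = [0.0, 1.0, 2.0, 4.0]
    activities = [100.0 * 0.5 ** (t / 1.5) for t in times]
    print(f"T1/2 = {half_life(times, activities):.2f} min")
```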

Example 15. Directed Evolution of Taq Polymerase Variants with Increased Activity in the
Presence of the Inhibitor Heparin

As indicated above, the methods of our invention can also be used to evolve resistance
to an inhibitor of enzymatic activity. Heparin is a widely used anticoagulant, but also a potent
inhibitor of polymerase activity, creating difficulties for PCR amplifications from clinical
blood samples (J. Satsangi, D. P. Jewell, K. Welsh, M. Bunce, J. I. Bell, Lancet 343, 1509-10
(1994)). While heparin can be removed from blood samples by various procedures, these can
be both costly and time-consuming. The availability of a heparin-compatible polymerase
would therefore greatly improve characterisation of therapeutically significant amplicons, and
obviate the need for possibly cost-prohibitive heparinase treatment of samples (Taylor A.C.
(1997) Mol Ecol 6, 383).

The L1* and L2* libraries are combined, and selected in emulsion for polymerases
active in up to 0.16 units heparin per µl. After a single round, 5 active clones are isolated in
the 96 well PCR screen incorporating 0.1 units/30 µl reaction, with the wild type showing no
activity. Titration shows that 4 of these clones are active in up to four times the amount of
heparin inhibiting wild type (0.06 units/30 µl versus 0.015 units/30 µl). The other clone is active
in up to eight times the amount of heparin inhibiting wild type (0.12 units/30 µl versus
0.015 units/30 µl).

Using selection in the presence of increasing amounts of heparin, we isolated H15, a Taq
variant functional in PCR at up to 130-times the inhibitory concentration of heparin (Table 2).
Intriguingly, heparin resistance conferring mutations also cluster, in this case in the base of the
finger and thumb polymerase subdomains, regions involved in binding duplex DNA. Indeed,
judging from a recent high-resolution structure of a Taq-DNA complex (Y. Li, S. Korolev, G.
Waksman, EMBO J 17, 7514-25 (1998)) four out of six residues mutated in H15 (K540, D578,
N583, M747) directly contact either template or product strand (as shown in Figure 7). H15
mutations appear to be neutral (or mutually compensating) as far as affinity for duplex DNA is
concerned (while presumably reducing affinity for heparin) (Table 2) (KD for DNA is
determined using BIAcore. Briefly, the 68-mer used in (M. Astatke, N. D. Grindley, C. M.
Joyce, J Biol Chem 270, 1945-54 (1995)) is biotinylated at the 5' end and bound to a SA
sensorchip and binding of polymerases is measured in 1x Taq buffer (see above) at 20 °C.
Relative KD values are estimated by the PCR ranking assay using decreasing amounts of
template). The precise molecular basis of heparin inhibition is not known, but our results
strongly suggest overlapping (and presumably mutually exclusive) binding sites for DNA and
heparin in the polymerase active site, lending support to the notion that heparin exerts its
inhibitory effect by mimicking and competing with duplex DNA for binding to the active site.
Our observation that heparin inhibition is markedly reduced under conditions of excess template
DNA (see (Clones are screened and ranked by a PCR assay. Briefly, 2 µl of induced cells are
added to 30 µl PCR mix and amplification of a 0.4 kb fragment is assayed under selection
conditions (e.g. increasing amounts of heparin). Thermostability and heparin resistance of
purified His tagged wt and mutant Taq clones is determined as in (F. C. Lawyer, et al., PCR
Methods Appl 2, 275-87 (1993); F. C. Lawyer, et al., J Biol Chem 264, 6427-37 (1989)) using
activated salmon sperm DNA and normalized enzyme concentrations), Table 2) appears
consistent with this hypothesis.

Table 2: Properties of selected Taq clones

Taq clone   T1/2 (97.5°C)   Heparin resistance   KD DNA   kcat     KM dTTP   5'-3' exo      Mutation
            (min)           (units/ml)           (nM)     (s-1)    (µM)      activity       rate§

Taq*        n.d.            0.6                  0.8†     4.0‡     43.2      n.d.           1.1
Taq wt      1.5**           90**, 0.6***         0.8      9.0      45.0      1              1
T8          16.5**          n.d., 0.3***         1.2      8.8      48.6      [illegible]    1.2
H15         0.3**           1750**, 84***        0.79     6.8      47.2      [illegible]    0.9

* commercial Taq preparation (HT Biotechnology), ** with N-terminal His6 tag, measured by CTP
incorporation into salmon sperm DNA, *** no tag, measured by PCR assay, † Taq wt, published
value: 1 nM (7), Klenow (Cambio), 4 nM, ‡ E. coli DNA Pol I, published value: 3.8 s-1 (A. H.
Polesky, T. A. Steitz, N. D. Grindley, C. M. Joyce, J Biol Chem 265, 14579-91 (1990)), § in
relation to Taq wt, measured by mutS ELISA (Genecheck) (P. Debbie, et al., Nucleic Acids Res
25, 4825-4829 (1997)), Pfu (Stratagene): 0.2.



Example 16: Template evolution in emulsion selection

A classic outcome of in vitro replication experiments is an adaptation of the template sequence
towards more rapid replication (S. Spiegelman, Q. Rev. Biophys. 4, 213-253 (1971)). Indeed, we
also observe template evolution through silent mutations. Unlike the coding mutations (AT to
GC vs. GC to AT / 29 vs. 16), non-coding mutations display a striking bias (AT to GC vs. GC
to AT / 0 vs. 42) towards decreased GC content, generally thought to promote more efficient
replication by facilitating strand separation and destabilizing secondary structures. Apart from
selecting for adaptation, our method may also select for adaptability; i.e. polymerases might
evolve towards an optimal, presumably higher, rate of self-mutation (M. Eigen,
Naturwissenschaften 58, 465-523 (1971)). Indeed, mutators can arise spontaneously in asexual
bacterial populations under adaptive stress (F. Taddei, et al., Nature 387, 700-2 (1997); P. D.
Sniegowski, P. J. Gerrish, R. E. Lenski, Nature 387, 703-5 (1997)). By analogy, it could be
argued that our method might favour polymerase variants that are more error-prone and hence
capable of faster adaptive evolution. However, none of the selected polymerases displayed
increased error rates (Table 2). Eliminating recombination and decreasing the mutational load
during our method cycle may increase selective pressures towards more error-prone enzymes.
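The contrast between the coding and silent mutation spectra can be quantified from the four
counts quoted in this Example. The sketch below is an illustration only: the choice of a
one-sided Fisher exact test is an assumption made here, not a statistical analysis described in
the text, and the function names are hypothetical.

```python
from math import comb


def gc_to_at_fraction(at_to_gc: int, gc_to_at: int) -> float:
    """Fraction of the observed transitions that lower GC content."""
    return gc_to_at / (at_to_gc + gc_to_at)


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with all margins fixed, of a table whose
    top-left cell is at least as large as the observed value a.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return tail / total_tables


# Counts from the text: (AT to GC, GC to AT).
CODING = (29, 16)
SILENT = (0, 42)

if __name__ == "__main__":
    print(f"coding: {gc_to_at_fraction(*CODING):.2f} of changes lower GC content")
    print(f"silent: {gc_to_at_fraction(*SILENT):.2f} of changes lower GC content")
    print(f"one-sided Fisher p-value: {fisher_one_sided(*CODING, *SILENT):.2e}")
```

The test asks how likely it is that all 29 AT to GC changes would fall among the coding
mutations if direction and mutation class were independent.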

Example 17. Assay for Heparin Tolerance of Polymerases

Heparin tolerance of polymerases is assayed using a similar assay to that for thermal
stability. Heparin is serially diluted into the activity buffer (0-320 units/45 µl) and 5 µl of
enzyme in the standard PCR mixture above are added. Reactions are incubated and
incorporation assayed as above.
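A dilution series of the kind described can be laid out as follows. This is a sketch under
stated assumptions: the text gives only the range (0-320 units per 45 µl) and the 5 µl enzyme
addition, so the two-fold step and the number of steps used here are illustrative choices.

```python
def twofold_series(top_units: float, steps: int) -> list:
    """Two-fold dilutions starting at top_units, followed by a no-heparin control."""
    return [top_units / (2 ** i) for i in range(steps)] + [0.0]


def final_units_per_ul(units_in_buffer: float,
                       buffer_ul: float = 45.0,
                       enzyme_ul: float = 5.0) -> float:
    """Heparin concentration once the enzyme aliquot has been added to the buffer."""
    return units_in_buffer / (buffer_ul + enzyme_ul)


if __name__ == "__main__":
    # Eight two-fold steps is an assumed layout, not a value from the text.
    for units in twofold_series(320.0, 8):
        print(f"{units:8.2f} units/45 ul buffer -> "
              f"{final_units_per_ul(units):.4f} units/ul in the 50 ul reaction")
```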

Example 18. Selection for Taq Variants with Increased Ability to Extend from a 3'
Mismatched Base

The primers used are Primer 10 (LMB388ba5WA) and Primer 11 (8fo2WC), as listed in Table 3.
This primer combination presents polymerase variants with a 3' purine-purine mismatch (A-G),
and a 3' pyrimidine-pyrimidine mismatch (C-C). These are the mismatches least tolerated by
Taq polymerase (Huang et al., 1992, Nucleic Acids Res 20(17):4567-73) and are poorly
extended.

The selection protocol is essentially the same as before, except that these two primers
are used in emulsion. Extension time is also increased to 8 minutes. After two rounds of
selection, 7 clones are isolated which display up to a 16-fold increase in extension off the
mismatch as judged by a PCR ranking assay (see example 2: using primers 5 and 11) and
standardised for activity using the normal primer pair. These clones are subsequently shuffled
back into the original L1* and L2* libraries along with wild-type Taq and the selection
process repeated, albeit with a lower number of cycles (10) during the CSR reaction. This
round of selection yielded numerous clones, the best of which displayed up to 32-fold increase
in mismatch extension as judged by PCR (see example 2) using primers 5 and 11.

Incorporation of an incorrect base pair by Taq polymerase can stall the polymerisation process
as certain mismatches (see above) are poorly extended by Taq. As such, Taq polymerase alone
cannot be used in the amplification of large (>6 kb) templates (Barnes). This problem can be
overcome by supplementing Taq with a polymerase that has a 3'-5' exonuclease activity (e.g.
Pfu polymerase) that removes incorrectly incorporated bases and allows resumption of
polymerisation by Taq. The clones above are therefore investigated for their ability to carry
out amplification of large DNA fragments (long-distance PCR) from a lambda DNA template,
as incorporation of an incorrect base would not be expected to stall polymerisation. Using
primers 12 (LBA23) and 13 (LF046) (1 µM each) in a 50 µl PCR reaction containing 3 ng
lambda DNA (New England Biolabs), dNTPs (0.2 mM) and 1x PCR buffer (HT Biotech), clone
M1 is able to amplify a 23 kb fragment using 20 repetitions of a 2-step amplification cycle
(94 °C, 15 seconds; 68 °C, 25 minutes). Wild type polymerase is unable to extend products above
13 kb using the same reaction buffer. Commercial Taq (Perkin Elmer) could not extend
beyond 6 kb using buffer supplied by the manufacturer.

Example 19. Selection Using Self-Sustained Sequence Replication (3SR)

To demonstrate the feasibility of 3SR within emulsion, the Taq polymerase gene is
first PCR-amplified from the parent plasmid (see example 1) using a forward primer that is
designed to incorporate a T7 RNA polymerase promoter into the PCR product. A 250 µl 3SR
reaction mix comprising the modified Taq gene (50 ng), 180 units T7 RNA polymerase (USB),
63 units reverse transcriptase (HT Biotech), rNTPs (12.5 mM), dNTPs (1 mM), MgCl2
(10 mM), primer Taqba2T7 (primer 12; 125 pmoles), primer 88fo2 (primer 4; 125 pmoles),
25 mM Tris-HCl (pH 8.3), 50 mM KCl, and 2.0 mM DTT is made. 200 µl of this is emulsified
using the standard protocol. After prolonged incubation at room temperature, amplification of
the Taq gene (representing a model gene size) within emulsion is seen to take place as judged
by standard gel electrophoresis.

To further expand the scope of the method, the 3SR reaction is carried out in an in
vitro transcription/translation extract (EcoPro, Novagen). The inactive taq gene (see example
1) is amplified from parental plasmid using primers 2 (TaqfoSal) and 12 (Taqba2T7). 100 ng
(approx. 1x10^10 copies) is added to make up 100 µl of the aqueous phase comprising EcoPro
extract (70 µl), methionine (4 µl), reverse transcriptase (84 units, HT Biotech), primer 12
(Taqba2T7, 2 µM), primer 13 (TaqfoLMB2, 2 µM) and dNTPs (250 µM). The aqueous phase is
emulsified into 400 µl oil-phase using the standard protocol. After incubation at 37 °C
overnight, the emulsion is extracted using the standard protocol and the aqueous phase further
purified using a PCR-purification column (Qiagen). Complete removal of primers is ensured
by treating 5 µl of column eluate with 2 µl ExoZap reagent (Stratagene). DNA produced in
emulsion by 3SR is rescued by using 2 µl of treated column eluate in an otherwise
standard 50 µl PCR reaction using 20 cycles of amplification and primers 6 (LMB2) and
12 (Taqba2T7). Compared to background (the control reaction where reverse transcriptase is
omitted from the 3SR reaction in emulsion), a more intense correctly sized band could be seen
when products are visualised using agarose gel electrophoresis. The 3SR reaction can
therefore proceed in the transcription/translation extracts, allowing for the directed evolution
of agents expressed in aqueous compartments.
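The approximate copy number quoted for 100 ng of template can be checked with the usual
mass-to-molecules conversion. The sketch below is illustrative: the amplicon length of 2500 bp
(roughly the size of the Taq polymerase gene) and the average mass of 650 g/mol per base pair
are assumptions introduced here, not values stated in the text.

```python
AVOGADRO = 6.022e23          # molecules per mole
GRAMS_PER_MOL_PER_BP = 650.0  # common approximation for double-stranded DNA (assumed)


def dsdna_copies(mass_ng: float, length_bp: int) -> float:
    """Number of double-stranded DNA molecules in mass_ng nanograms."""
    moles = (mass_ng * 1e-9) / (length_bp * GRAMS_PER_MOL_PER_BP)
    return moles * AVOGADRO


if __name__ == "__main__":
    # 2500 bp is an assumed amplicon length, used only for illustration.
    print(f"100 ng of a 2500 bp product: {dsdna_copies(100.0, 2500):.1e} copies")
```

Under these assumptions the result is of the order of 10^10 molecules; a different amplicon
length changes the figure in inverse proportion.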

WT Taq polymerase has limited reverse transcriptase activity (Perler et al., (1996) Adv
Protein Chem. 48, 377-435). It is also known that reverse transcriptases (e.g. HIV reverse
transcriptase that has both reverse transcriptase and polymerase activities) are considerably
more error prone than other polymerases. This raises the possibility that a more error-prone
polymerase (where increased tolerance for non-cognate substrate is evident) might display
increased reverse transcriptase activity. The genes for Taq variants M1, M4 as well as the
inactive mutant are amplified from parental plasmids using primers 12 (Taqba2T7) and 2
(TaqfoSal) and the 3SR reaction is carried out as above in the transcription/translation extract
(Novagen) with the exception that reverse transcriptase is not exogenously added. In control
reactions, methionine is omitted from the reaction mix. After 3 hours incubation at 37 °C, the
reaction is treated as above and PCR carried out using primer pair 6 and 12 to rescue products
synthesised during the 3SR reaction. Of the clones tested, clone M4 gave a more intense
correctly sized band compared to control reaction when products are visualised by
gel electrophoresis. Clone M4 would therefore appear to possess some degree of reverse
transcriptase activity. This result shows that it is possible to express functionally active
replicases in vitro. When coupled to selection by compartmentalisation, novel replicases could
be evolved.

Selection of Agents Modifying Replicase Activity

Example 19 and the following Examples describe how the methods of our invention
may be employed to select an enzyme which is involved in a metabolic pathway whose final
product is a substrate for the replicase. These Examples show a method for selection of
nucleoside diphosphate kinase (NDP Kinase), which catalyses the transfer of a phosphate
group from ATP to a deoxynucleoside diphosphate to produce a deoxynucleoside triphosphate
(dNTP). Here, the selectable enzyme (NDK) provides substrates for Taq polymerase to
amplify the gene encoding it. This selection method differs from the compartmentalized self-
replication of a replicase (CSR, Ghadessy and Holliger) in that replication is a coupled
process, allowing for selection of enzymes (nucleic acids and protein) that are not replicases
themselves. Bacteria expressing NDK (and containing its gene on an expression vector) are
co-emulsified with its substrate (in this case, dNDPs and ATP) along with the other reagents
needed to facilitate its amplification (Taq polymerase, primers specific for the ndk gene, and
buffer). Compartmentalization in a water-in-oil emulsion ensures the segregation of
individual library variants. Active clones provide the dNTPs necessary for Taq polymerase to
amplify the ndk gene. Variants with increased activity provide more substrate for their own
amplification and hence post-selection copy number correlates to enzymatic activity within
the constraints of polymerase activity. Additional selective pressure arises from the minimum
amount of dNTPs required for polymerase activity, hence clones with increased catalytic
activity are amplified preferentially at the expense of poorly active variants (selection is for
kcat as well as Km).

By showing that we can evolve an enzyme whose product feeds into the polymerase
reaction, we hope to eventually co-evolve multiple enzymes linked through a pathway where
one enzyme's product is substrate for the next. Diversity could be introduced into two or
more genes, and both genes could be co-transformed into the same expression host on
plasmids or phage. We hope to develop cooperative enzyme systems that enable selection for
the synthesis of unnatural substrates and their subsequent incorporation into DNA.

Example 20. Induced Expression of NDP Kinase in Bacterial Cells

A pUC19 expression plasmid containing the EcoRI/HindIII restriction fragment with
the open reading frame of Nucleoside Diphosphate Kinase from Myxococcus xanthus is
cloned. Plasmid is prepared from an overnight culture and transformed into the ndk-, pykA-,
pykF- strain of E. coli QL1387. An overnight culture of QL1387/pUC19ndk is grown in the
presence of chloramphenicol (10 µg/ml final concentration), ampicillin (100 µg/ml final
concentration) and glucose (2%) for 14-18 hours. The overnight culture is diluted 1:100 in
(2xTY, 10 µg/ml chloramphenicol, 100 µg/ml ampicillin and 0.1% glucose). Cells are grown
to an O.D. (600 nm) of 0.4 and induced with IPTG (1 mM final concentration) for 4 hours at
37 °C. After protein induction, cells are washed once in SuperTaq buffer (10 mM Tris-HCl pH
9, 50 mM KCl, 0.1% Triton X-100, 1.5 mM MgCl2, HT Biotechnology) and resuspended in
1/10 volume of the same buffer. The number of cells is quantified by spectrophotometric
analysis with the approximation of O.D.600 0.1 = 1x10^ cells/ml.

Example 21. Phosphoryl Transfer Reaction in Aqueous Compartments Within an Emulsion

To establish whether deoxynucleoside diphosphates can be phosphorylated by NDP
kinase in Taq buffer, a standard PCR reaction is carried out in which dNTPs are replaced by
dNDPs and ATP, a donor phosphate molecule. Nucleoside diphosphate kinase is expressed
from E. coli QL1387 (a ndk and pyruvate kinase deficient strain of E. coli) as described in the
previous example. Cells are mixed with the PCR reaction mix.

Washed cells are added to a PCR reaction mixture (approx. 8e5 cells/µl final
concentration) containing SuperTaq buffer, 0.5 µM primers, 100 µM each dNDP, 400 µM
ATP, SuperTaq polymerase (0.1 unit/µl final concentration, HT Biotechnology).
After breaking open the cells at 65 °C for 10 min, incubating the reaction mixture for
10 minutes at 37 °C and thermocycling (15 cycles of 94 °C 15 sec, 55 °C 30 sec, 72 °C
1 min 30 sec), amplified products are visualized on a standard 1.5% agarose/TBE gel stained
with ethidium bromide (Sambrook). The results of this experiment show that expressed NDP
kinase can phosphorylate dNDPs to provide Taq polymerase with substrates for the PCR
amplification of the ndk gene.

The experiment is repeated, with the additional step of emulsifying the reaction
mixture with mineral oil and detergent as described above. It is found that NDP kinase is
active within aqueous compartments of an emulsion.

Example 22. Compartmentalization of NDK Variants by Emulsification

The original emulsion mix allowed for the diffusion of small molecules between
compartments during thermocycling. However, by adjusting the water to oil ratio and
minimizing the thermocycling profile, the exchange of product and substrate between
compartments is minimized, resulting in a tighter linkage of genotype to phenotype. Given that
diffusion rates can be controlled by modifying the emulsion mix, it may be possible to adjust
buffer conditions after emulsification, possibly allowing for greater control of selection
conditions (i.e. adjusting pH with the addition of acid or base, or starting/stopping reactions
with the addition of substrates or inhibitors).

150 µl of PCR reaction mix (SuperTaq buffer, 0.5 µM each primer, 100 µM each
dNDP, 400 µM ATP, 0.1 unit/µl Taq polymerase, 8x10^5 cells/µl of QL1387/ndk) are added
dropwise (1 drop/5 sec) to 450 µl oil phase (mineral oil) in the presence of 4.5% v/v Span 80,
0.4% v/v Tween 80 and 0.05% v/v Triton X-100 under constant stirring in a 2 ml round
bottom biofreeze vial (Corning). After addition of the aqueous phase, stirring is continued for
an additional 5 minutes. Emulsion reactions are aliquoted (100 µl) into thin-walled PCR tubes
and thermocycled as indicated above.

Recovery of amplified products after emulsification is carried out as follows. After
thermocycling, products are recovered by extraction with 2 volumes of diethyl ether, vortexed,
and centrifuged for 10 minutes in a tabletop microfuge. Amplification products are analyzed
as before.
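The emulsion recipe of Example 22 can be turned into pipetting volumes with a few lines of
arithmetic. This is a sketch under one assumption that the text does not spell out: the
surfactant percentages are taken to be volume fractions of the 450 µl oil phase. The names
used are illustrative.

```python
OIL_PHASE_UL = 450.0
AQUEOUS_PHASE_UL = 150.0
ALIQUOT_UL = 100.0

# Surfactant percentages (v/v) quoted in the text.
SURFACTANTS_PERCENT = {"Span 80": 4.5, "Tween 80": 0.4, "Triton X-100": 0.05}


def component_volume_ul(total_ul: float, percent_vv: float) -> float:
    """Volume of a component present at percent_vv (v/v) in total_ul."""
    return total_ul * percent_vv / 100.0


def aqueous_fraction(aqueous_ul: float, oil_ul: float) -> float:
    """Fraction of the finished emulsion made up by the aqueous phase."""
    return aqueous_ul / (aqueous_ul + oil_ul)


if __name__ == "__main__":
    for name, percent in SURFACTANTS_PERCENT.items():
        print(f"{name}: {component_volume_ul(OIL_PHASE_UL, percent):.3f} ul")
    print(f"aqueous fraction: {aqueous_fraction(AQUEOUS_PHASE_UL, OIL_PHASE_UL):.2f}")
    # Upper bound on aliquots, ignoring losses during transfer.
    print(f"aliquots: {int((OIL_PHASE_UL + AQUEOUS_PHASE_UL) // ALIQUOT_UL)}")
```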

Example 23. Minimizing Background Kinase Activity

Background kinase activity levels are determined by emulsifying E. coli TG1 cells in
Taq buffer with substrates, as described above. It is found that native nucleoside diphosphate
kinase from E. coli retained enough activity after the initial denaturation to provide significant
kinase activity in our assay. The pUC19 expression plasmid containing the ndk gene is
transformed into a ndk deficient strain of E. coli QL1387. Compared to a catalytic knockout
mutant of mx ndk (H117A), the background kinase activity is determined to be negligible in
our assay (amplified products could not be visualized by agarose gel electrophoresis) when
ndk is expressed from the knockout strain.

Example 24. Maintenance of the Genotype-Phenotype Linkage in Emulsion

A catalytic knockout mutation (NDK H117A) of NDP kinase is co-emulsified with
wild-type NDP kinase in equal amounts. The inactive mutant of ndk is distinguished by a
smaller amplification product, since the 5' and 3' regions flanking the ORF downstream from
the priming sites are removed during construction of the knockout mutant. Our emulsification
procedure gives complete bias towards amplification of the active kinase, as determined by
agarose gel electrophoresis.

Example 25: Method for the parallel genotyping of heterogeneous populations of cells

The approach involves compartmentation of the cells in question in the emulsion (see
WO9303151) together with PCR reagents etc. and polymerase. However, instead of linking
genes derived from one cell by PCR assembly, one (or several) biotinylated primers are used
as well as streptavidin coated polystyrene beads (or any other suitable means of linking
primers onto beads). Thus, PCR fragments from one single cell are transferred to a single
bead. Beads are pooled, interrogated for presence of a certain mutation or allele using
fluorescently labelled probes (as described for "Digital PCR") and counted by FACS.
Multiplex PCR allows the simultaneous interrogation of 10 or maybe more markers. Single
beads can also be sorted for sequencing.

Applications include, for example, diagnosis of asymptomatic tumors, which hinge on
the detection of a very small number of mutant cells in a large excess of normal cells. The
advantage of this method over cytostaining is throughput. Potentially 10^-10^ cells can be
interrogated simultaneously.
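Transferring the PCR fragments of one cell to one bead presupposes that most occupied
compartments hold a single cell. A common way to reason about this is Poisson loading; the
sketch below assumes that cells distribute randomly and independently among compartments,
which is an assumption introduced here and not a statement from the text. The loading values
are illustrative.

```python
from math import exp, factorial


def poisson(k: int, mean_cells: float) -> float:
    """Probability that a compartment holds exactly k cells at the given mean loading."""
    return exp(-mean_cells) * mean_cells ** k / factorial(k)


def single_cell_fraction(mean_cells: float) -> float:
    """Among compartments holding at least one cell, the fraction holding exactly one."""
    return poisson(1, mean_cells) / (1.0 - poisson(0, mean_cells))


if __name__ == "__main__":
    for mean in (0.05, 0.1, 0.3, 1.0):
        print(f"mean {mean:4.2f} cells/compartment: "
              f"{single_cell_fraction(mean):.3f} of occupied compartments are single-cell")
```

Lower loading gives cleaner single-cell compartments at the cost of more empty ones, so the
mean loading is a trade-off between purity and throughput.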

Example 25: short-patch CSR

The present example relates to the selection of polymerases with low catalytic activity
or processivity. Compartmentalized Self-Replication (CSR), as described, is a method of
selecting polymerase variants with increased adaptation to distinct selection conditions.
Mutants with increased catalytic activity have a selective advantage over ones that are less
active under the selection conditions. However, for many selection objectives (e.g. altered
substrate specificity) it is likely that intermediates along the evolutionary pathway to the new
phenotype will have lowered catalytic activity. For example, from kinetic studies of E. coli
DNA polymerase I, mutations such as E710A increased affinity and incorporation of
ribonucleotides at the expense of lower catalytic rates and less affinity for wild-type substrates
(deoxyribonucleotides) (F. B. Perler, S. Kumar, H. Kong, Adv. in Prot. Chem. 48, 377-430
(1996)). The corresponding mutant of Taq DNA polymerase I, E615A, could incorporate
ribonucleotides into PCR products more efficiently than wild-type polymerase. However,
using wild-type substrates, it is only able to synthesize short fragments and not the full-length
Taq gene, as analyzed by agarose gel electrophoresis. Therefore it would be difficult to select
for this mutation by CSR. In another selection experiment in which beta-glucuronidase is
evolved into a beta-galactosidase, the desired phenotype is obtained after several rounds of
selection but at the expense of catalytic activity. It is also found that selected variants in the
initial rounds of selection are able to catalyze the conversion of several different substrates not
utilized by either parental enzyme, and at much lower catalytic rates (T. A. Steitz, J Biol
Chem 274, 17395-8 (1999)).

In order to address the problem of being able to select polymerase variants with low
catalytic activity or processivity such as may occur along an evolutionary trajectory to a
desired phenotype, a variant of CSR, in which only a small region (a "patch") of the gene
under investigation is randomized and replicated, is employed. The technique is referred to as
"short-patch CSR" (spCSR). spCSR allows for less active or processive polymerases to still
become enriched during a round of selection by decreasing the selective advantage given to
highly active or processive mutants. This method expands on the previously described method
of compartmentalized self-replication, but, because the entire gene is not replicated, the short
patch method is also useful for example for investigating specific domains independent of the
rest of the protein.

There are many ways to introduce localised diversity into a gene, among these are error-
prone PCR (using manganese or synthetic bases, as described above for the Taq polymerase
library), DNA shuffling (C. A. Brautigam, T. A. Steitz, Curr Opin Struct Biol 8, 54-63 (1998);
Y. Li, S. Korolev, G. Waksman, EMBO J 17, 7514-25 (1998)), cassette mutagenesis (E. Bedford,
S. Tabor, C. C. Richardson, Proc Natl Acad Sci USA 94, 479-84 (1997)), and degenerate
oligonucleotide directed mutagenesis (Y. Li, V. Mitaxov, G. Waksman, Proc Natl Acad Sci USA
96, 9491-6 (1999); M. Suzuki, D. Baskin, L. Hood, L. A. Loeb, Proc Natl Acad Sci USA 93,
9670-5 (1996)) and its variants, e.g. sticky feet mutagenesis (J. L. Jestin, P. Kristensen, G.
Winter, Angew. Chem. Int. Ed. 38, 1124-1127 (1999)), and random mutagenesis by whole-
plasmid amplification (T. Oberholzer, M. Albrizio, P. L. Luisi, Chem Biol 2, 677-82 (1995)).
Combinatorial alanine scanning (A. T. Haase, E. F. Retzel, K. A. Staskus, Proc Natl Acad Sci
USA 87, 4971-5 (1990)) may be used to generate library variants to determine which amino acid
residues are functionally important.

Structural (M. J. Embleton, G. Gorochov, P. T. Jones, G. Winter, Nucleic Acids Res 20,
3831-7 (1992)), sequence alignment (D. S. Tawfik, A. D. Griffiths, Nat. Biotechnol. 16, 652-
656 (1998)), and biochemical data from DNA polymerase I studies reveal regions of the gene
involved in nucleotide binding and catalysis. Several possible regions to target include regions
1 through 6, as discussed in (D. S. Tawfik, A. D. Griffiths, Nat. Biotechnol. 16, 652-656 (1998))
(regions 3, 4, and 5 are also referred to as Motif A, B, and C, respectively, in Taq DNA
polymerase I). Other possible targeted regions would be those regions conserved across several
diverse species, those implicated by structural data to contact the nucleotide substrate or to be
involved in catalysis or in proximity to the active site, or any other region important to
polymerase function or substrate binding.

During a round of selection, each library variant is required to replicate only the region
of diversity. This can be easily achieved by providing primers in a PCR reaction which flank
the region diversified. CSR selections would be done essentially as described. After CSR
selection the short region which is diversified and replicated is reintroduced into the
starting gene (or another genetic framework e.g. a library of mutants of the parent gene, a
related gene etc.) using either appropriately situated restriction sites or PCR recombination
methods like PCR shuffling or Quickchange mutagenesis etc. The spCSR cycle may be
repeated many times and multiple regions could be targeted simultaneously or iteratively with
flanking primers either amplifying individual regions separately or inclusively.
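The idea of replicating only a flanked patch and then reintroducing it into the parent gene can
be illustrated in silico. The sketch below is a toy model: the sequences are invented for
illustration, exact primer matching is assumed, and the grafting step simply substitutes the
selected amplicon for the corresponding region, standing in for the restriction-site or PCR
recombination methods named in the text.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def patch_bounds(gene: str, forward: str, reverse: str) -> tuple:
    """Start and end (exclusive) of the region amplified by the two flanking primers."""
    start = gene.find(forward)
    if start == -1:
        raise ValueError("forward primer does not anneal")
    reverse_site = reverse_complement(reverse)
    site_start = gene.find(reverse_site, start + len(forward))
    if site_start == -1:
        raise ValueError("reverse primer does not anneal downstream")
    return start, site_start + len(reverse_site)


def amplicon(gene: str, forward: str, reverse: str) -> str:
    """Top-strand sequence of the short patch replicated in spCSR."""
    start, end = patch_bounds(gene, forward, reverse)
    return gene[start:end]


def graft(parent: str, forward: str, reverse: str, selected: str) -> str:
    """Reintroduce a selected amplicon into the parent gene at the primer-defined site."""
    start, end = patch_bounds(parent, forward, reverse)
    return parent[:start] + selected + parent[end:]


# Invented toy sequences (not taken from the sequences of this document).
UPSTREAM, FWD_SITE, PATCH = "ATGAAACCC", "GGTACGTG", "ACGTTGCA"
REV_SITE, DOWNSTREAM = "CCTTGGAG", "TTTTAA"
PARENT = UPSTREAM + FWD_SITE + PATCH + REV_SITE + DOWNSTREAM
FORWARD_PRIMER = FWD_SITE
REVERSE_PRIMER = reverse_complement(REV_SITE)

if __name__ == "__main__":
    print("amplified patch:", amplicon(PARENT, FORWARD_PRIMER, REVERSE_PRIMER))
    variant = FWD_SITE + "ACGATGCA" + REV_SITE  # a selected patch with one change
    print("regrafted gene: ", graft(PARENT, FORWARD_PRIMER, REVERSE_PRIMER, variant))
```

Moving either primer outward lengthens the replicated segment, which is how the stringency
of the selection would be raised in this picture.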

To increase stringency in selections at a later stage spCSR is tunable simply by
increasing the length of replicated sequence as defined by the flanking primers up to full
length CSR. Indeed, for selection for processivity i.a. it may be beneficial to extend the
replicated segment beyond the encoding gene to the whole vector using strategies analogous
to iPCR (inverted PCR).

spCSR can have advantages over full length CSR not only when looking for
polymerase variants with low activities or processivities but also when mapping discrete
regions of a protein for mutability, e.g. in conjunction with combinatorial alanine scanning
(A. T. Haase, E. F. Retzel, K. A. Staskus, Proc Natl Acad Sci USA 87, 4971-5 (1990)) to
determine which amino acid residues are functionally important. Such information may be
useful at a later stage to guide semi-rational approaches, i.e. to target diversity to
residues/regions not involved in core polymerase activity. Furthermore spCSR may be used to
transplant polypeptide segments between polymerases (as with immunoglobulin CDR
grafting). A simple swap of segments may lead initially to poorly active polymerases because
of steric clashes and may require "reshaping" to integrate segments functionally. Reshaping
may be done using either full length CSR (e.g. from existing random mutant libraries) or
spCSR targeted to secondary regions ("Vernier zone" in antibodies).

Short patches may also be located at either N- or C-terminus as extensions to existing
polymerase gene sequences or as internal insertions. Precedents for such phenotype modifying
extensions and insertions exist in nature. For example both a C-terminal extension of T5 DNA
pol and the thioredoxin-binding insertion in T7 DNA pol are critical for processivity in these
enzymes and enable them to efficiently replicate the large (> 30 kb) T-phage genomes. N- or C-
terminal extensions have also been shown to enhance activity in other enzymes.

Example 26: Low temperature CSR using Klenow fragment

Klenow fragment was cloned from E. coli genomic DNA into expression vector
pASK75 (as with Taq) and expressed in E. coli strain DH5αZ1 (Lutz R. & Bujard H. (1997),
Nucleic Acids Res 25, 1203). Cells were washed and resuspended in 10 mM Tris pH 7.5.
2x10^ resuspended cells (20 µl) were added to 200 µl low temperature PCR buffer (LTP)
(Iakobashvili, R. & Lapidot, A. (1999), Nucleic Acids Res., 27, 1566) and emulsified as
described (Ghadessy et al. (2001), PNAS, 98, 4552). LTP was 10 mM Tris (pH 7.5), 5.5 M
proline, 15% w/v glycerol, 15 mM MgCl2 + suitable primers (because proline lowers melting
temperature, primers need to be 40-mers or longer) and dNTPs and emulsified as described.
Low temperature PCR cycling was 70 °C 10 min, 50x (70 °C 30 sec, 37 °C 12 min). Aqueous
phase was extracted as described and purified selection products reamplified as described
(Ghadessy et al. (2001), PNAS, 98, 4552).
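Because the 37 °C extension step is long, the low temperature programme runs for many
hours. The sketch below adds up the hold times given in this Example; it ignores ramping
between temperatures, which depends on the instrument and is not specified in the text.

```python
def programme_minutes(initial_hold_min: float, cycles: int, step_minutes: list) -> float:
    """Total hold time of a cycling programme in minutes, excluding temperature ramps."""
    return initial_hold_min + cycles * sum(step_minutes)


# Values from the text: 70 C for 10 min, then 50 cycles of 70 C for 30 sec and 37 C for 12 min.
INITIAL_HOLD_MIN = 10.0
CYCLES = 50
STEP_MINUTES = [0.5, 12.0]

if __name__ == "__main__":
    total = programme_minutes(INITIAL_HOLD_MIN, CYCLES, STEP_MINUTES)
    hours, minutes = divmod(total, 60)
    print(f"total hold time: {total:.0f} min ({hours:.0f} h {minutes:.0f} min)")
```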

All publications mentioned in the above specification are herein incorporated by
reference. Various modifications and variations of the described methods and system of the
invention will be apparent to those skilled in the art without departing from the scope and
spirit of the invention. Although the invention has been described in connection with specific
preferred embodiments, it should be understood that the invention as claimed should not be
unduly limited to such specific embodiments. Indeed, various modifications of the described
modes for carrying out the invention which are apparent to those skilled in molecular biology
or related fields are intended to be within the scope of the following claims.

Primer      Designation    Sequence (5' to 3')

Primer 1    TaqbaXba       GGCGACTCTAGATAACGAGGGCAAAAAATGCGTGGTATGCTTCCTCTTTTTGAGCCCAAGGG
Primer 2    TaqfoSal       GCGGTGCGGAGTCGACTCACTCCTTGGCGGAGAGCCAGTCCTC
Primer 3    88ba4          AAAAATCTAGATAACGAGGGCAA
Primer 4    88fo2          ACCACCGAACTGCGGGTGACGCCAAGCG
Primer 5    Taqba(sar)     GGGTACGTGGAGACCCTCTTCGGCC
Primer 6    LMB2           GTAAAACGACGGCCAGT
Primer 7    LMB3           CAGGAAACAGCTATGAC
Primer 8    88ba4LMB3      CAGGAAACAGCTATGACAAAAATCTAGATAACGAGGGCAA
Primer 9    88fo2LMB2      GTAAAACGACGGCCAGTACCACCGAACTGCGGGTGACGCCAAGCG
Primer 10   LMB388ba5WA    CAG GAA ACA GCT ATG ACA AAA ATC TAG ATA ACG AGG GA (A-G mismatch)
Primer 11   8fo2WC         GTA AAA CGA CGG CCA GTA CCA CCG AAC TGC GGG TGA CGC CAA GCC (C-C mismatch)
Primer 12   LBA23          GGAGTAGATGCTTGCTTTTCTGAGCC
Primer 13   LF046          GCTCTGGTTATCTGCATCATCGTCTGCC

Table 3. Primer sequences used in Examples
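The relationships between the primers in Table 3 can be checked mechanically. The sketch
below is an illustration only: the sequences are the table entries as read after cleaning up
obvious character-recognition errors (which is itself an assumption), and the helper names
are hypothetical. It checks that the tagged primers 8 and 9 are the LMB3 and LMB2 tags joined
to primers 3 and 4, and it classifies the 3' terminal mismatch that each mismatch primer
presents to the template.

```python
PURINES = {"A", "G"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Table 3 entries as read after OCR cleanup (assumed readings).
LMB3 = "CAGGAAACAGCTATGAC"
LMB2 = "GTAAAACGACGGCCAGT"
PRIMER_3 = "AAAAATCTAGATAACGAGGGCAA"
PRIMER_4 = "ACCACCGAACTGCGGGTGACGCCAAGCG"
PRIMER_8 = "CAGGAAACAGCTATGACAAAAATCTAGATAACGAGGGCAA"
PRIMER_9 = "GTAAAACGACGGCCAGTACCACCGAACTGCGGGTGACGCCAAGCG"
MISMATCH_WA = "CAGGAAACAGCTATGACAAAAATCTAGATAACGAGGGA"
MISMATCH_WC = "GTAAAACGACGGCCAGTACCACCGAACTGCGGGTGACGCCAAGCC"


def base_class(base: str) -> str:
    """Classify a DNA base as purine or pyrimidine."""
    return "purine" if base in PURINES else "pyrimidine"


def three_prime_mismatch(mismatch_primer: str, matched_primer: str) -> tuple:
    """Describe the 3' terminal mismatch of a primer relative to a fully matched primer.

    The template base is inferred as the complement of the matched primer at the
    position of the mismatch primer's last base.
    """
    pos = len(mismatch_primer) - 1
    if mismatch_primer[:pos] != matched_primer[:pos]:
        raise ValueError("primers differ upstream of the 3' terminal base")
    primer_base = mismatch_primer[pos]
    template_base = COMPLEMENT[matched_primer[pos]]
    kind = f"{base_class(primer_base)}-{base_class(template_base)}"
    return primer_base, template_base, kind


if __name__ == "__main__":
    print("primer 8 is LMB3 + primer 3:", PRIMER_8 == LMB3 + PRIMER_3)
    print("primer 9 is LMB2 + primer 4:", PRIMER_9 == LMB2 + PRIMER_4)
    print("LMB388ba5WA:", three_prime_mismatch(MISMATCH_WA, PRIMER_8))
    print("8fo2WC:     ", three_prime_mismatch(MISMATCH_WC, PRIMER_9))
```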
SEQUENCES

Thermostable clone T7-88: Nucleotide sequence

AACCTTGGTATGCTTCCn€TTmGAGCCC^ 
5 CXTTACCGCAOnnKXIACHaCCCT 

ACGGCirCGCCAAGAGCCTCCTCAAGGOCCT 
ACGCCAAGGCCX:CCrccrCCCGCCACX3AGGCCT^ 
CGGAGGACTTTCCCCGGCAACTCGCCCTCATCAAGGAGCTGGTGGACCT 
TCGAGGTC(XGGGCrACGAGGCGGACGACGT(XTGGCCAGCXn'GGCC^ 
10 GGCTACGAGGT<XXK:ATarrcACH:GCCGACA^ 
TOjrCCACCCCXjAGGGGTACXrrcATCACX: 
Aa^AGTGGGO^GACrACdGGGCCCTGACXXK^ 

TCGGGGAGAAGACGGCGAAGAAGCTTCTGQAGGAGTGGGGGAGCCrGGAAGC^^ 
CTGGACCGGCTGAAGCCCGCCATCCGGGAGAAGATCCTGGCCCACACGGACGATCTGAAC^ 
15 TGGGACCTGGa:AAGGTGCGCACCXjA(XTGCCCX:TGGAGG 
GACCGGGAGAGGCTTAGGGCCirrCrGGAGAGGOT 

cttctggaaagcccxiaaggcccrrggaggaggccccc^ 
tttgtccttixxxxk:^ 

cx3ggtccac0gggcccx:cqagccitataaagcc(^ 

20 GCCAAAGACXTGAGCGTTCTGGCCCTAAGGGAAGGCCITGGCCT 

CTCCTCG(XrrACCTCCTGGACCCTTCCAACACCACCX:CCGAGGGGGTGGCCCG^ 

gagtggacggaggaggcx3ggggagcgggcx:gccctttccgagaggctct^ 

gaggcttgagggggaggagaggctcctttggctt^ 

cctggccx:acatggaggccacgggggtgak^^ 

25 GGTCGCCGAGGAGATOKXXXiCUraSAGGCOGA^ 
CAACTCa:X3GGACCAGCTXK5AAAGGGTCOT 

GGAGAAGACCGGCAAGCGCTCCACCAGCGCCGCCHSTCCTGGAGGCCCrCCGCGAGGC 

CGTGGAGAAGATCCTGCAGTACCGGGAGCrCACCAAGCTGAAGAGCACCTACAW 

GGACCTCATCCACCCCAGGACGGGCCG(XTCCACACCCGCTTCAACCAG 
30 CAGGCTAAGTAGCTCCGATCCCAACCnx:CAGAACATCCCCGTCCGCAC^^ 

CCG<XXjGGCCTTCATCGCCGAGGAGGGGTGGC^^ 

CAGGGTGCTGGOOIACCTCTCCGGCGACOAQAAC^ 

CX:ACACGGAAACm:CAGCTQOATGTTCQGCQTC(^^ 

GGCXKKJCAAGAOIATCAACTTCGGQGTT^ 
35 AGa:ATCCCTTACGAGGAGGCCCAGGCCTrCATTGAGCGCTAC^ 

GGCCTGGATTGAGAAGAO^CTGGAGGAGGGCAQOAGGCGGGGGTACGTGGAGAO^ 
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gtcgccgctacgtgcx:agacctagaggccogggtgaagagcgtgc^^ 
gccttcaacatgcccgtccagggcaccgcxxkxxiaan^ 
cccaggctggaggaaatgggggccaggatgctccttcaggtccacgacgac^ 
ccaaaagagagggcggaggccgtggcccggctggccaaggaggtcatggagggggt^^ 
5 ggccgtgccx:ctggaggtggaggtggggataggggaggactggctctctgccaagga 



Thermostable clone T7-88: Amino Acid Sequence

10 

MLPUHEPKGRVIXVDGIfflLAYRTF^^ 

SSRHEAYGGYKAGRAPTPEDFPRQIAUKELVDLIXSIJ^^ 

ADKDLYQlXSDRIHVlJIPEGYLITPAmWEKyGIja^ 

EEWGSLEAIJLEhIIJ>RIia>AIREKIL^ 
15 GSIXHEFGU^SPKALEEAPWPPPEGAFVGFVl^ 

EARGIJLAKDMVIJULIUBGIXiLPPGDDI^^ 

NLWGRLEGEERLLWLYREVERPLSAVLAHMEATO 

NUISRDQLERVUT>BLGIPMGKTEKTGKRSTO 

HPRTGRLHTRFNQTATATGM^SSDPNIXJNIPWW^ 
20 DENLIRWQEGRDIHICTASWMFGWREAVDPyv^^ 

ERYFQSFPKVRAWmKTlJEEGRKRGYVErLFGRRRYWDI^ARV^ 

MKIJ\MVKIJTRlJEEMQAKh^^ 

WLSAKB* 


Thermostable clone T9: Nucleic Acid Sequence 

GATGClXXCTCITmQAGax:^ 

ACClTa::AC«CCCTGAAGGGCCrrcAC«^ 

GCX:AAGAGCOXXTCAAGG<XXTrCA^ 
30 GCCCCCnx:cnTCCGCCACGAGGCCrA03GGGGGTACAAG 

mCCaXKSCAACTCGCCCTCATCAAGGAGCTC^ 

(XGGGCTACGAGGCGGACGACGTCCnXKK:CAGCCTGGCX:AAGAAGG 

GGTCCGCATCCTCACCGCCGACAAAGACCmA(X^ 

CCCGAGGGOTACCrCATCACCCXXWantK^ 
35 GCCX5ACTA0CGGGCX:crraACX:GGGQA0GAG 

GAAGACGGCGAGGAAGCITCTGGAGGAGTGGGGGAGCC^ 
GGCTOAAGCCOKXATCCGGGAGAAGATCXTGGCCCA^^
TGGCCAAGGTGCGCACCGACCTGCCGCTGGAGGTGGAOT 
GAGAGGCTTAGGGCXnrTTCTGa^GAGGCTTGAGCTTGOCAGC^^ 
GAAAOaXCAAGGaXTGOAGQAGOCCTOCTGOaXXJCC^^ 
5 CriTrCCCGCAAGGAG<XX:ATGlXK3GCXX}AT^^ 

CACCGGGCCCCCGAGCOTATAAAGCCCrCAGAGAOrrGAAM 
GACCTGAGCGTTCTGGCCCTQAGGGAAGGarrTC^ 
GCCTACCTCCTGGACCCnTaiAACAa^ACC^ 
ACGGAGGAGGCGGGGGAGCGGGCCGOXTnaXJAGAOT 
10 TGAGGGGGAGGAGAGGCTCCTITGGCTTrACCXK^ 
CX^ACATGGAGGOJACGGGGGTGCGCCTGG^ 

CGAGGAGATCGCXXrGCCrcGAGGCCQAGGTCTTCCGCXrrGGCC^^ 
CCGAGACCAGCTGGAAAGGGTCCTCirr^ 
GACCGGCAAGaKTCCACCAGCGCOQCCGTCC^ 
15 OAAGATOC1XK:AGTA(XGGGAGCTCA<XL\AGC^ 
CAlXX:VlCXXX:AGGAa3GGC0GCCTXX:AC^ 

aagtagckx:gatxxcaacctxx:agaacatc^ 
ggccttcatcgccgaggaggggtggctattogtcgcoctggact^ 
gcxw}cccacctct(xggcgacgagaa<xnxlatxxxkm 
20 ggaoacogoiagctggatgttakkxjtcxjccc^^ 
caagaccatcaacttcgggqltxtctacggc^ 
cttacgaggaggca:aggccttcattgagcgctaci^^ 
ttgagaagaccctggaggagggcaggaggcgggggtacgtggagaoctc^^ 
taoltgccagacctagaggccxxiggtgaagagcgtgcgggaggcggco^ 

25 CATGCCXX3Ta:AGGGCACCGa:GCaiAC^^ 
GGAGGAAATXKKiGGCXlAGGATGCTtXT^ 

AGAGGGCGGAGGCCGTGGiXCGGCTGGCCAAGGAGGTCATGGAGGGGGTGTATar 

CCXXnX3QAGGTGQAQGTGGQGATAGGGGAGGACnK3CTC^ 

GGCAGCGCTTGGCGTCACCCGCAGTTCGGTGGTACTGGCCGTCGTm 

Thermostable clone T9: Amino Acid Sequence

J5 MIPLFEPKGRVIXVDGHHIAYRTFHAIJCGLTT^ 

SFRHEAYGGYKAGRAPTPEDITRQIJ\UKELVDUX5I^^ 
ADKDLYQUiiDRIHVUffiEGYIXrPAWLWEKYG^ 
EBWGSl£ALUCNUDRIja>AIREKI^^
LGSLIJIEFGII^SPKALEEASWPPPEGAFVGFVI^RKEPM^ 
KEARGlIj\KDl^VIj\LREGmLPPGDDPMlIAYLI^ 
AM.WGRLEGEERIXWLYI^VERPLSAVIJVHMEATGVRm 
5 FNmSRDQI^RVLFDELGLPAIGKTEKTGKRSTSAAVL^ 
HPRTORUnMJNQTATATGRLSSSDPN^^ 
DENURWQEGRDIHTETASWMFGWREAVDPLK^ 
ERYFQSFPKVRAWffiKTUSEGRRRGYVBTLFGR^ 
NfKIj\MVKIJFPRLEEMGARM^ 
10 WLSAKB 



Thermostable clone T13: Amino Acid Sequence

MU^IJEPKGRVIXVDGHHLAYRTFHAIiCGLTTSR^ 
SFRHEAYGGYIU^GRAPTPEDFPRQlj\liKELVDLLGIj\^ 
ADKDLYQUSDRIHVUIPBGYmPAWLWEKYG^ 
EEWGSl£AIIJBNU>RIJGPAII^^ 

20 GSLUBEFGIJLESPKAIJSEAPWPPPEGAFVGF^ 

EARGIIAia>LSVLAIJUBGLGLPPGDDPMIJj\^^^ 
bn.WGRI^EERLLWLYREVERPLSAVIAHMBATGVRm 
NLNSRIX}imVIJT>ELGIJ>AIGKTEKTGKRSTSAAV^^ 
HPRTGRUnRFNQTATATGRMSSDPNIXJl^ 

25 DENLIRWQEGRDnnETASWMFGWREAVDPU^^^ 
BRYFQSFPKVRAWIEKTLEEGRRRGYVETLFGRIUm^ 
Mia-AMVmPRIJBEMGARMLIXJVHDE^ 
WLSAKE 



Thermostable clone 8 (T8): Nucleic Acid Sequence

TCGTGGTACGCATCCTCirriTGAGCXX^ 
TACCGCACXlTCCACGCCXrrGAAGGGCCrC^ 
GGCTTCXjCCAAGAGCCTCCTCAAGGCCCTCA^^ 
35 GCCAAGGCCCCCrCCTaX]Ga:AC»AGGOT 
GAGGACmCCC<XK3CAACrCGCC^

GAGGTCCCGGGCTACGAGGCXKjACGACGTCCTGGCCAGCCTGGC^ 

CTATGAGGTCX;GCATCCrCACCG<XGACAAAGACCrmA^^ 

CTCCACCXXJGAGGGGTACCnXJATCACCCCGGCCTGGGTTTGGG^^ 
5 CAGTGGGCCGACTACXXKKKXXrroA(XXKKK^ 

GGGGAGAAGACGGCGAAOAAGCTTCTGGAGQAGTGGGGGAGOT 

GGACCGGCTQAAGCXXGCXlATaXKKlAGAAQAT^^ 

GGACCrGGCCAAGOTGaK:ACC0AC^^ 

Aa:GGGAGAGGCTTAGGGCCITIXnXK3AGA<^^ 
10 TTCTGGAAAG(XCCAAGGCCCTGGAGGAGGCCCX:CTGGCCCrc 

TTQTGCTITCCCGCAAGGAGCCXATGTGGQC^ 

GGGTCXJACCGGGCCXXICQAGCCITATAAAG^^ 

CCAAAGACCTGAGCX3TTCTGGC0CTAAGGG 

TCCIXXKX]TACXJltXTGGAC^^ 
15 AGTCGACGGAGGAGGCGGGGGAGCGGGCCGCCCmCCGAGAGGCTC^ 

AGGCTTGAGGGGGAGGAGAGGCTCCTTTGGC^ 

CTGGCCCACATGGAGGCCACAGGGGTGCGCCTGGAOjTGGCCTATC^ 

GTGGCCGAGGAGATC^CCOKXnXXjAGGCCGAGGTOT 

AACnmXKKiACC^GCTGGAAAGGGT(XT 
20 GAGAAGAa:X3GCAAGajCTXX:ACC^ 

GTGGAGAAQATCCTGCAGTAaXKKiAGCl^ 

GACCTCATCCACCCCAGGAOKKKXXKXnCCACACCCGCT^ 

AGGCTAAGTAGCT(X:GATCCX:AACCriXX;AGAAC^^ 

CGCCGGGCCirCATCGCCGAGGAGGGGTGGCTATTGGT^ 
25 AGGGTGCTGGOOCACXTOKXKKK^ 

CACACGGAAACX«CCAGCTX3GATGTT(XKK:^ 

gcxk3ccaagacx::atcaacttoggggtt^^ 
gccatc(xnta(:x3aggaggcccaggccttcattgag(^^ 

GCCTGGATTGAGAAGACCXTGGAGQAGGGCAGGAGGCGGGGGTACGTG^ 
30 CCGCCGCTACGTGCCAQACCTAGAGGCa:GGGTGAAGAGCGTGCGGGAGGCGGCCGAGCG<^ 
CCITCAACATG<XCGTCCAGGGCACCGCCGCa5AC^^ 
CAGGCTGQAGGAAATGGGGGCCAGGATGCTCXr^^ 
AAAAGAGAGGGeGGAGGCCGTGGCCajGCTGGOL^ 
OXiTGCCXXrKKlAGGTGGAGGTGGGGATAGGGQAGGACT 

Thermostable clone 8 (T8): Amino Acid Sequence

PIJFEPKGRYliVTCHHIAYRTimiiCGLT^
HEAYGGYKAGIU\PTPEDFPRQLAIJKELW 
KDLYQU^DRIHVUffEGYLITPAWLWEKYGLRPDQWADYR^ 
WGSLEAIXE>tt.DiaJKPAIREKIUVHTD 
5 miEFGII^PKAIJBEAPWPPPEGAFVGFVW 
RGIIjVKDLSVLAUlEGLGIJ>PGDDPMLLA 
WGRIJSGEEIUXWLYREVDRPLSAVLAH^ffiATO 
NSRTKJIJSRVLFDmXJIPAIGKTEKTGKRSTSAAV^^ 
TGRUnOTNQTATATGM-S^ 
10 NLIRWQEGIU)1H1ETASWMFGVPREAVDPU4RRA^ 

WQSFPKVRAWIEKTLEEGRRRGYVBTIJFGRiaiYVPDI^^ 

lAMVKLFPRI^EMGARMLLQVHDELVI^^ 

AKE* 

Note: First two amino acids at N terminus not sequenced.



Heparin Resistant Clone 94: Nucleic Acid Sequence

atttttqagcccaaoggcc»cq^ 
ccctgaagggccnxilaccaccagccggggggagccg^ 
tcxh'caaggccxn'caaggaggacggggacgcggtgatcg^^ 
tccgccacgaggojracggggggtacaaggcgggccxk^ 
25 aactcgccctcatcaaggagctggtggaccrcctgggw 
aggcggaojacgtcxtggccagcctggayvag 
ctcaoxjccgacaaagacc^ 
a(xtcatca<xcx:ggcctggcttt^^ 

GGGCCCTGACOKjGGACGAGTaXjACAACXrr^ 
30 AGGAAGCITCrGGAGGAGTGGGGGAG(XTGGAAGCXXTC^ 

CG(XATCX:GGGAGAAGATCCTGGCCCACATGGACGATCrrG;^ 

GCGCACCXJACCTGCCXXrrGGAGGTGGACTTCGCCAi^ 

GGGOCTTTCIGGAGAGGCITCAGT^^ 

AGOaCCGOAGGAGGCmxrrGGCCXmKXXXJA^^ 
35 AGGAGCCCATOTGQGCOQATCTTCTGQC^ 

CXXJAGCCTTATAAAGCCCrCAGGGACCTO 

TTCTQGCCCTGAGGGAAGGCCTTGGCCTC^ 
(WACCCTTCCAACACCACCXXCQAGGGGGTGGCXXXK^
CGGGGGAGCXKKK3CGCCCTTTa:GAGAG<^ 
GAGAGGCTCCnTrGGCTTrACCGGGAGGTGGAGAGGC^ 
GCCACGGGGGTGCGCCTGGACGTGTCCTATCT^AGGGi^^ 
5 GCmjCCTCGAGGCCGAGGTCTTCCGCCTGGCCXKK: 
CTGGAAAOGGTCCTCTTTGACGAGCTA^^ 

axrrccAaiAGCGCCGcxjiOT^^ 

CAGTACCGGOAGCTCACXIAAGCrGAAGAGCACCT^ 
AGGACGGGCCGCCTCCAC^CCCGCTTeAACCAGAO^ 

10 GGTCCCAACXrrcrAGAGCATCCCCGTCCGCACCCCGCTTG<^ 
GOXJAGGAGGGGTGGCTATTGGTGGCOCTGQACTATA^ 
CTCTCCGGCGACGAGAACCTGATCCGGGTC^^ 
AGCTGGATGTTCGCK^GTCCCCXXiGGAGGCOT 
AACTrcGOGOTCCIUTAOKK: 

15 AGGCCCAGGCriTCATTOAGai^^ 

<XX:rGGAGGAGGGCAGGAGGCGGGGGTACGTGGAGACCCrrc^^ 
GAanViGAGGCCCGGGTGAAGAGCQTGOGGGAGGCGG<X»AG(XK:AT^^ 
CCAGGGCACCGCCGCOjACCTCATGAAGCTGGCTATGGTGAAGCTCT^^ 
GGGGGCCAGGATGCrCCrrcAGGTa:ACQACQAGCTXK}T^^ 

20 AGGOCX3TXKKXXXKKnX3GOCAAGGAGGl^^ 

TGQAGGTGGGGATAGGGGAGGACTGGCICTaXlOCAAGGAGTGAT^ 



Heparin Resistant Clone 94: Amino Acid Sequence

FEPKGRVLLVDGHHLAYRTFHA1XCH.TTSRGEPVQAVYOT 

MYGGYKAGRAPTPEDFPRQIJVUKELVDIIX3LAR^ 

DLYQII^DRfflVUIPEGYmPAWLWEKYGIJa*!^ 

30 GSUaAIIXr^RLEPAIREKIIJV^^ 

LHEFGIXESPKAPEEAPWPPPEGAFVGFVLSRKEPMWADIX^^ 
GLLAKDLSVIAU(EGLGIPPGDDPMIIAYL^ 
GIOiEGEEEULXWLYIUSVERPLSAVlJ^^ 
RDQLERVU1)ELGLPAIGKTEKTGKRSTSAAVI^^ 

35 (HOinMWQTATATGIO^SSGPmQSIFVR^ 

mWQEGM^IHTETASWMFGVPREAVDPLMRiUW^ 
QSFPKVRAWffiKTLEEQRRRGYVETLFQW^ 
MVKLFPRIJSEMGARMLLQVHDELV^^
E* 

Note: N-terminal 5 amino acids not determined



Heparin Resistant Clone 15: Nucleic Acid Sequence

10 mGAGCCX:AAGGGCCGCGTCCTCCTGGTGGACGGCCACCACCTGGCC^^ 

TaAAGGGCCTCACCACCAGCCGGGGGGACKDCGGTGCAGGOGGTCTAC^^ 

TCAAGGCCCTCAAGGAGGACGGGGAOSOGGTQATCGTGGT^^ 

GCCACGAGGCXrrACGGGGGGTACAAGGCGGGCXXKjGCC^ 

CTCGCCCTCATCAAGGAGCTGGTGQAC^^ 
15 GCGGACGACGTCCTGGCCAGCCTGGCCAAGAAGGCGGAAAAGGAGGGCTACGAGGT 

CACCGCCGACAAAGACCTITACCAGCTXXTTTCCGAC^ 

CTCATCACCCCGGCCTGGCriTrGGGAAAAGTA(^ 

GCCCTGACCGGGGACGAGTCCXfcACAAOnT^ 

GAAGCTIUTGGAGGAGTGGGGGAGiXTGQAAQC^^ 
20 <XATXXGGGAGAAGATCXnXKKXX::ACATGGACX3ATC^ 

GCACCGACCTGCC03TGGAGGTGGACTTCGCCAAAAGG^ 

GCCnrrCTGGAGAGGCTTGAGTTTGGCAGCC^ 

GCCCTGGAGGAGGCCCCCTGGCCCCCGCCGGAAGGGGCCTTCG^ 

GAGCCX:ATGTGGGCCGATCTTCrGGCCCTGGCCGCCGCCAQG 
25 QAGCCTTATAAAGCCCrcAGGGACCTGAAGGAGGCQCG 

CTGGCCCTGAGGGAAGGCCTTGGCKn'CmK^^ 

ACOCITCCAACACQ\CCC^ 

GGQGAGOGGGCCGCCCTITCXXJAGAGGCTCTTCGa^ 

GAGGCTCCTTTGGCTITACCGGGAGGTGGAGAGGCCCC^^ 
30 TACGGGGOTGCGCCnX3GACGTGGCCrATCTCAGGGCCTT^^^ 

CCGCCTCGAGGCCGAGGTCnrCXiGCCTGGCCGGCCAC^^ 

GAAAQGGTCCrcirroACXUGCTAGGGCI^^ 

TCCAiXAGOGCCGCCGTCXnWj 

TAC(XKK3AGCTCACX:A0GCTOAAQA^ 
35 ACGGQCOGOnXX^ACACaXKnTCAAO^ 

(XXlAA(Xn:(X:AGAQCATC(X:CGTCC 

GAGGAGGGGTGGCTATTQGTGGCCHJIXiGAC^^ 
TCX^GGCGACGAGAACCTGATCOKIGTCTTCCATO^

CTGCUTGTTC(KjCGTCCCCX:GGGAGGCCGTGGACC(XCTGATGCGCCGG(^ 
CITCGGGGT(XTCTACGGCATGTCGGCCCACCGC(^^ 
GCCCAGGCCTTCATTGAGCGCTACTTTCAGAGCTTCC^^ 
5 CTGGAGGAGGGCAGQAGGCGGGGGTACGTGGAGACCCTCTTCGGCCXSCCGCCGCTAaST^ 
CXTAGAGGC<XXKKjTGAAGAGCX3TGCGGGAGGCGGCC^ 
AGGGCACCGOCGCCGAaJTCATGAAGCTGGC^^ 
GGGCX:AGGATC(nXXnTCAGGTCX:AajAOGAG^ 

Ga:GTGG<XCGGCTGGCCAAGGAGGTCATGGAGGGGGTGTATCCCCTGGC^ 
10 GAGGTGGGGATAGGGQAGGACTGGCTCKXXSOIA^ 



Heparin Resistant Clone 15: Amino Acid Sequence

PIJEPKGRVIXVDGHHIJ^YRTFHAIXGLTTSRGEPVQ 

HEAYGGYKAGRAPTPm)FPRQlJUJm.VDI^ 

KDLYQU^DRIHVUIPEGYUTTAWLWBK^ 

20 WGSLEAIIJK3aDK[JBPAIR£K]IAI^^ 

SIXHEFGlIJBSPKAim^PWPPPEGAFVGFVI^SRK^ 
ARGIXAKDI^VIALREGIXjIJ»PQDDPMIJLAYI^ 
LWGRLEGEERLLWLYREVERPI^VLAHMEATGVRm 
LNSRIXJI£RVIJFDELGU>AIGKTEKTGKRSTS^ 

25 RTGRLHTRFNQTATATGRI^SGPNIX3SIP\^^ 

NURVKJBGimiHTBTASWMFGWREAVDPUd^^ 
YFQSFPKVRAWffiKlIJEEGRRRGYVET^ 
AMVKIWRLEEMGARMLLQ\ra>ELVLE^ 
KB* 

Note: N-terminal 5 amino acids not determined



Mismatch extension clone M1: Nucleic acid sequence.

TTOGAATOCTCCCTCTTXTK^ 
CCGCACCiraiAaKXXnXJAAGGGC^^ 
CTTa5(XAAGAGCCTCCTCMGGaXTC

CAAGQCCCCCTCCTTCCGCCACGAGGCCTACGGGGGGTACAAGG^ 
GGACnrCCCCGGCAACTCGCCCTCATCAAGGAGCT 

GGTCCCGGGCTAOiAGGCGGACGACGTCCTGGCCAGCCTGGCCAAGAAGGCC^ 
5 ACGAGGTCCGCATCCTCACCGCCGACAAAGGCCnTrACCAGCTCC^ 
CCACCCCGAGGGGTACCTCATCACXCCGG^ 
OTGGGOXiACTACCGGGCCCTQACaKK^ 
GGAGAAGACGGOSAGGAAGCTrcroGAGGAGTGGGGGAGC^ 

aocggcraaagoccgcscatccgggagaagatccrg^ 
10 atctggccaaggtgcgcaccgacctgccgctggaggtggacntcgcx: 

gggagaggcitagggcctrtctggagaggcttgagtl^ 

tggaaagcccx:aaggccctggaggaggccccctggcx;cccgccggaaggggcct^ 

tccttrcxx:gcagggagc5ccatot^ 

tocaccgggccqcxwagccttataaagcxxj^ 
15 aagacctgagcgttctogccctoagggaaggccttggcct 

tcgcctacctcctggacccttccaacaccacccccgagggggtggcccggcgct 

ggacggaggaggcgggggagcgggccgccctttccgagaggctcttcgcc^cctgt^^ 

CTTGAGGGGGAGGAGAGGCTanrrGGCTTTACX:^ 
GCCX^ACATGGAGGCCACGGGGGTGCGariGGACGTGGCCTATCT 
20 G0CGAGGAGATaK:CXXKICIXX}AGGCXX3AGGT^ 
TCSCCGGGACCAGCTGGAAAGGGTCCnU^ 

AAGACCGGCAAGCGCTa^ACCAGCGCCGCCGTCCTGGGGGCCCTCCGCGAGGCC^ 
GAGAAGATCXTGCAGTACCGGGAGCrCACCAAGCTGAAGAGCACCTACATT^ 
CTCATCCACCa:AGGACGGGCCGCCTCCACACCCG<nTCAACCAG 
25 CTAAGTAGCTCCGATCCCAACCTCCAGAACATrcCCGTCCGCACCCCGC^^ 
CXKK3CXrrrcATOGCa3AGGAGGGGTGGCTAT^ 
GTGCIGGCCCACCTCimjGOGACQAGAAC 
A0GGAGA(XX3CCAGCTXKiATGTO:GGCXjTC 

GCCAAGACCATCAACTTCGGGGTCCTCTACGGCATGTCGGCCCACCGCCT 
30 ATCCCITACX5AGGAGGCCCAGGCCTTCATTGAGCGCTACm 

TGGATTGAGAAGACCCTGGAGGAGGGCAGGAGGCGGGGGTACGTGGAGACCCTCTTCG 

CCGCTACGTGCCAQACCTAGAGGCCCQGGTGAAGAGCGTGCGGGGGGCGQCCGAGCGCATGGC^ 

TCAACATGCCCXiTCCAGGGCAOaK^^ 

GGCTGGAGGAAATGGGGGCCAGOATQCTCCTTCAGGTC^ 
35 AAGAGAGOGCGQAQGCCGTGGCCaK}CTGGCCAA 

GTGCCCCTGGAGGTGGAGGTGGGGATAGGGGAGGACTGGCTCTCCGCX^AAGGAGT^ 

GCAGGCAGCGCTTGGCGTCACX:CGCAGTTCGGTGGTTAATAAGOT 
GCGCACATTGTGCGACATTriTnTGTCTGCCGTI^ 

CTGTAGCGGCGCArrAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCr 
CmXrrAQCGCXX:GCTCCTTTCGC^^

AAGCTCTAAATCGGGGGCraXrmAGKK^ 

AAAATTGATTAGG 



Mismatch extension clone M1: Amino acid sequence.

GMLPUFEPKGRVLLVIXJHHIAYRTFHA^ 

PSFRHEAYGGYKAARAPTPEDFPRQIJVLIKELVDIXGIJ^^ 

TiU^KGLYQII^DRIHVIilPEGYUTPAWLWEK^ 

IfEWOSIJBAIIJQqUDRI^ 
10 mrGSII.HEFGIXESPKALEBAPWPPPEGAFVQFVLSBI^ 

KEARGlXAKDLSVIj^LREGIXJUTGDDPim^ 

ANLWGia.EGEERLLWLYREVERPLSAVlJ^HMEATGVR^ 

FNUaSlU>QLERVIJ^ELGU»AIGK^ 

IHPRTGRLinTlFNQTATATGKLSSSDPMXJl^ 
15 GDBNLffiVFQEGRDIHTETASVi^GVP^ 

lERYFQSFPKVRAWlEKTLEBGRRRGYVETLFGR^ 

MKLAMVKUTRI^EMQARA^^ 

WLSAKB 

Note: N-terminal 2 amino acids not determined.



Mismatch extension clone M4: Nucleic acid sequence. 

TXjnTATOAGCCCAAGGGCWKXJT^^ 
25 GC(XrrQAAGGGCCTCACCACCAGCCGGGGGGAGC^^ 

arrCCrcAAGGCCCTCAAGGAGGGCGGGGAOiCGGTGAT^^ 

CTTCCCCCATGAGGCCTACGGGGGGTACAAGGCGGGCCGGGCCCCCACGCCGGAGGAC^ 

ACAACT(XiCCCTCATCAAGGAGCTGGTGGACCTCCTGGGGCT^ 

CGAGGCGGACX3ACGTCCTGGCCAGCCTGGCCAAGAAGGCGGAAAAGGAGGGCT 

30 TCCTCACCGCCGACAAAQACCTTTACCAG^^ 

GTACCTCATCAaXXXKKXntSGC^ 

C03GGCCCTGA(XGGGGACGAGTCa}AC^ 

CGAGGAAGCTTCrGGAGGAGTGGGGGAGCCTGGAAGCCCT<X^ 

CCJaSCCATCCGGGAQAAGATCCTGGCCXIACAT^ 
35 GTGCGCACCGACXJTGCOXTGGAGGTGGAC^ 

TAGGGCCTTICTGGAQAGGCTTGAGTTrGGCAGCCT^ 
AAGGCXXTGGAGGAGGCCCCCTGGCCm^
AAGGAGCCCATGTGGGCCGATCrTCTAGCX:CTGGCCG^ 
CCCGAGCCTTATAAAGCCCTCGGGGACCTGAAGGAGGCGCGGGGGCTTCrC^ 
GTTCTGGCCCTGAGGGAAGGCCTrGG(XTCCCGCCCGACGAC^^ 
5 TGGACanrrcAACACCACXCCCGAGGGGGTGGCCCGGCGCTACGG^ 
GCAGGGGAGCXjGGCCGCanrrcaiAGAGGC^^ 
GGAAAGQCrccnriGGCrTTAC^^ 

GGCCACGGGGGTWXKXnrGGAaSTGGCCTATCTCAGGGCC^ 
OGOCOGCCTa5AGGCa3AGGTCTrCCG<^ 

10 gctggaaagggtcctctttgacgagctagggcitcc^ 

gcgctcca(xagcgccgccgtcxnxkk3ggccctc^ 

gcagtaccx3ggagctca(x:aagctgaagagp\cctacattgacccot 

caogacgggccgcctcxacacccxk:^ 

a}at(xcaacctccagagcatxxm}tccgc^ 
15 cgccgaggaggggtggctattggtggccctcgacta^^ 

cctctccggcgacgagaacctgatccgggtcttcxaggaggggcgggacat^^ 

CAGCTGGATGllXXjGCGTCCCCCGGGAGGCCGlXKiACC^^ 
CAACITOKKSGTCCTCrACGGCATGTCGGCCCAC^ 
GAGGCCCAGGCCTTCATTAAGCGCTACTTICAGAG 
20 ACCCTGGAGGAGGGCAGGAGGCGGGGGTACXiTGGAGACC^^ 
AGACXTAGAGGa:CXKK3TGAAGAGCGTGCGGGAGCXX3GCCGA 
TtX:AGGQTACCGa:GC«JAantIATG^ 

TGGGGGCCAGGATGCTCCTTCAGGTCCACGACGAGCTGGTCCTC 
GAGGCCGTGGCCCGGCTGGCCAAGGAGGTCATGGAGGGGGTGTATCCCCTGGCC^^ 
25 GTGGAGGTGGGGATAGGGQAQGACTGGCTCTCCGCCA^GGAGTGAGT 



Mismatch extension clone M4: Amino acid sequence.

LYEPKGRVIXYTCHHLAYRTFHAIiLGLTre 
HEAYGGYKAGRAPTPEDFPRQIJajKEL^ 

30 DLYQLLSDRIHVUffEGYLITPAWLWEKYGl^ 
GSLEAUJCNIJ)RIJO>AIREKlIJ^I^ 
UJffiFGIXESPKALEEAPWPPPEGAFVGFVLSKKEPN^ 
RGLIJ^I^VUVLREGLGI^DDDPMUAYIXDPSK^ 
WGRLEGEERLLWLYREVERFI^SAVLAHMEATGVI^ 

35 NSW)QimVIiT>EIXHPAIGKTE!C^ 

RTGRIJ^TRFNQTATATGW-SSSDP^^X^SIPV^^ 
NLIRWQEGRDlirrETASWMFGWREAVDPI^^ 
YFQSFPKVRAWIEKlT:.BBGRBRGYVETLFOIUtRYWDI£ARV^^

LAMVKIiTIU£EMOAI»iIIjQVHDELVI£APKBE^^ 

AKE 

Note: N-terminal 6 amino acids not determined

Claims

1. A method of selecting a nucleic acid processing (NAP) enzyme, the method
comprising the steps of:

(a) providing a pool of nucleic acids comprising members encoding a NAP enzyme or
a variant of the NAP enzyme;

(b) subdividing the pool of nucleic acids into compartments, such that each
compartment comprises a nucleic acid member of the pool together with the NAP
enzyme or variant encoded by the nucleic acid member;

(c) allowing nucleic acid processing to occur; and

(d) detecting processing of the nucleic acid member by the NAP enzyme.
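
As an illustration only, the enrichment that steps (a) to (d) aim at can be sketched with a toy model. The sketch assumes one variant per compartment and treats each variant's activity as a hypothetical per-cycle replication efficiency between 0 and 1; the function name `enrich` and the model itself are assumptions made for this example and are not part of the claimed method.

```python
def enrich(activities, cycles):
    """Share of the pooled product contributed by each variant.

    activities: hypothetical per-cycle replication efficiencies (0.0 to 1.0),
                one per compartment (assumption: one variant per compartment).
    cycles:     number of replication rounds inside each compartment.
    """
    # Each compartment starts with one gene copy; an efficiency of 1.0
    # doubles the copy number every cycle, 0.0 leaves it unchanged.
    copies = [(1.0 + a) ** cycles for a in activities]
    total = sum(copies)
    return [c / total for c in copies]


if __name__ == "__main__":
    pool = [1.0, 0.5, 0.0]
    for activity, share in zip(pool, enrich(pool, cycles=10)):
        print(f"activity {activity:.1f}: {share:.4f} of pooled product")
```

Because every gene is amplified only by the enzyme it encodes, the more active variants dominate the pooled product after the compartments are broken, which is the linkage between genotype and phenotype that the method relies on.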

2. A method of selecting an agent capable of modifying the activity of a NAP enzyme,
the method comprising the steps of:

(a) providing a NAP enzyme;

(b) providing a pool of nucleic acids comprising members encoding one or more
candidate agents;

(c) subdividing the pool of nucleic acids into compartments, such that each
compartment comprises a nucleic acid member of the pool, the agent encoded by the
nucleic acid member, and the NAP enzyme; and

(d) detecting processing of the nucleic acid member by the NAP enzyme.

3. A method according to Claim 2, in which the agent is a promoter of NAP enzyme
activity.
4. A method according to Claim 2 or 3, in which the agent is an enzyme, preferably a
kinase or a phosphorylase, which is capable of acting on the NAP enzyme to modify
its activity.

5. A method according to Claim 2 or 3, in which the agent is a polypeptide involved in a
metabolic pathway, the pathway having as an end product a substrate which is
involved in a nucleic acid processing reaction.

6. A method according to Claim 2 or 3, in which the agent is a polypeptide capable of
producing a substrate or consuming an inhibitor in a nucleic acid processing reaction.

7. A method according to Claim 2 or 3, in which the agent is a polypeptide capable of
modifying a nucleotide primer or nucleoside triphosphate substrate used in a nucleic
acid processing reaction such that

a) its 3' end becomes extendable; or

b) a substrate portion appended to the nucleotide primer or nucleoside triphosphate is
modified such as to allow detection or capture of product appendage of the
incorporated nucleotide primer or nucleoside triphosphate.

8. A method of selecting a pair of polypeptides capable of stable interaction, the method
comprising:

(a) providing a first nucleic acid and a second nucleic acid, the first nucleic acid
encoding a first fusion protein comprising a first subdomain of a NAP enzyme fused to
a first polypeptide, the second nucleic acid encoding a second fusion protein
comprising a second subdomain of a NAP enzyme fused to a second polypeptide; in
which stable interaction of the first and second NAP subdomains generates processing
activity, and in which at least one of the first and second nucleic acids is provided in
the form of a pool of nucleic acids encoding variants of the respective first and/or
second polypeptide(s);

(b) subdividing the pool or pools of nucleic acids into compartments, such that each
compartment comprises a first nucleic acid and a second nucleic acid together with
respective fusion proteins encoded by the first and second nucleic acids;

(c) allowing the first polypeptide to bind to the second polypeptide, such that binding
of the first and second polypeptides leads to stable interaction of the NAP subdomains
to generate NAP enzyme activity; and

(d) detecting processing of at least one of the first and second nucleic acids by the
NAP enzyme.

9. A method of selecting a pair of polypeptides capable of stable interaction, the method
comprising:

(a) providing a first nucleic acid and a second nucleic acid, the first nucleic acid
encoding a first fusion protein comprising a first subdomain of polypeptide capable of
enhancing the activity of a NAP enzyme fused to a first polypeptide, the second
nucleic acid encoding a second fusion protein comprising a second subdomain of
polypeptide capable of enhancing the activity of a NAP enzyme fused to a second
polypeptide; in which stable interaction of the first and second NAP subdomains
generates processing activity, and in which at least one of the first and second nucleic
acids is provided in the form of a pool of nucleic acids encoding variants of the
respective first and/or second polypeptide(s);

(b) subdividing the pool or pools of nucleic acids into compartments, such that each
compartment comprises a first nucleic acid and a second nucleic acid together with
respective fusion proteins encoded by the first and second nucleic acids;

(c) allowing the first polypeptide to bind to the second polypeptide, such that binding
of the first and second polypeptides leads to stable interaction of the subdomains of the
NAP activity enhancing agent to generate NAP enzyme activity; and

(d) detecting processing of at least one of the first and second nucleic acids by the
NAP enzyme.

10. A method of selecting a polypeptide capable of stable folding, the method comprising:

(a) providing a pool of nucleic acids comprising members encoding one or more
candidate polypeptides fused to a NAP enzyme;

(b) subdividing the pool of nucleic acids into compartments, such that each
compartment comprises a nucleic acid member of the pool, the fusion polypeptide
encoded by the nucleic acid member, and the NAP enzyme; and allowing for folding
of the fusion polypeptide such that a folded fusion polypeptide will not inhibit NAP
activity; and

(c) detecting processing of the nucleic acid member by the NAP enzyme.

11. A method of selecting a polypeptide capable of promoting stable folding, the method
comprising:

(a) providing a poorly folding polypeptide fused to a NAP enzyme and the candidate
chaperone;

(b) providing a pool of nucleic acids comprising members encoding one or more
candidate chaperones;

(c) subdividing the pool of nucleic acids into compartments, such that each
compartment comprises a nucleic acid member of the pool, the candidate chaperone
encoded by the nucleic acid member, the NAP-enzyme fusion to the poorly folding
polypeptide; and allowing for chaperone-aided folding of the fusion polypeptide such
that a folded fusion polypeptide will not inhibit NAP activity; and

(d) detecting processing of the nucleic acid member by the NAP enzyme.



12. A method according to any preceding claim, wherein the NAP enzyme is a replicase
enzyme.

13. A method according to any preceding claim, wherein the NAP enzyme is a replicase
enzyme and the processing of the nucleic acid member constitutes replication of the
whole or a segment(s) of said member.

14. A method according to any preceding claim, wherein the NAP enzyme is a replicase
enzyme and the processing of the nucleic acid member comprises either a fill-in
reaction of a 5' overhang appended to said member or an extension of a 3' end of said
member.

15. A method according to claim 12, in which amplification of the nucleic acid results
from more than one round of nucleic acid replication.

16. A method according to any one of claims 12 to 15, in which the amplification of the
nucleic acid is an exponential amplification.

17. A method according to any one of claims 12 to 16, in which the amplification reaction
is a polymerase chain reaction (PCR), a reverse transcriptase-polymerase chain
reaction (RT-PCR), a nested PCR, a ligase chain reaction (LCR), a transcription based
amplification system (TAS), a self-sustaining sequence replication (3SR), NASBA, a
transcription-mediated amplification reaction (TMA), or a strand-displacement
amplification (SDA).

18. A method according to any of Claims 12 to 17 as dependent upon claim 1, in which
the post-amplification copy number of the nucleic acid member is substantially
proportional to the activity of the replicase.

19. A method according to any one of Claims 12 to 17 as dependent upon claim 2, in
which the post-amplification copy number of the nucleic acid member is substantially
proportional to the activity of the agent.
20. A method according to any of Claims 12 to 17 as dependent upon claims 3 to 5, in
which the post-amplification copy number of the nucleic acid member is substantially
proportional to the binding affinity and/or kinetics of the first and second polypeptides.

21. A method according to any one of claims 12 to 20, in which nucleic acid replication is
detected by assaying the copy number of the nucleic acid member.

22. A method according to any one of claims 12 to 20, in which nucleic acid replication is
detected by assaying the copy number of segments of the nucleic acid member.

23. A method according to any one of claims 12 to 20, in which nucleic acid replication is
detected by assaying the presence of tagging of the nucleic acid member.

24. A method according to any one of claims 12 to 23, in which nucleic acid replication is
detected by determining the activity of a polypeptide encoded by the nucleic acid
member.

25. A method according to any preceding claim, in which the conditions in the
compartment are adjusted to select for a NAP enzyme or an agent active under such
conditions, or a pair of polypeptides capable of stable interaction under such
conditions.

26. A method according to any one of claims 12 to 24, in which the replicase activity is a
templated replicase activity such as a polymerase, reverse transcriptase or ligase
activity.

27. A method according to any preceding claim, in which the polypeptide is provided from
the nucleic acid by in vitro transcription and translation.

28. A method according to any of Claims 1 to 18, in which the polypeptide is provided
from the nucleic acid in vivo in an expression host.
29. A method according to any preceding claim, in which the compartments comprise
aqueous compartments of a water-in-oil emulsion.

30. A method according to Claim 29, in which the water-in-oil emulsion is produced by
emulsifying an aqueous phase with an oil phase and a surfactant comprising 4.5% v/v
Span80, 0.4% v/v Tween80 and 0.1% v/v TritonX100, or a surfactant comprising
Span80, Tween80 and TritonX100 in substantially the same proportions.
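
The proportions recited in Claim 30 can be turned into pipetting volumes with simple arithmetic. The sketch below is illustrative only and assumes that each v/v percentage refers to the final oil-surfactant mixture, with the remainder made up by the carrier oil; the claim itself does not state the reference volume, and the function name is hypothetical.

```python
# Surfactant percentages (v/v) as recited in Claim 30.
SURFACTANT_PERCENT_VV = {"Span80": 4.5, "Tween80": 0.4, "TritonX100": 0.1}


def oil_phase_volumes(total_ml):
    """Split a total oil-phase volume (ml) into component volumes (ml).

    Assumption: percentages are relative to the final oil-surfactant
    mixture, and the carrier oil makes up the remainder.
    """
    volumes = {
        name: total_ml * percent / 100.0
        for name, percent in SURFACTANT_PERCENT_VV.items()
    }
    volumes["oil"] = total_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for component, ml in oil_phase_volumes(10.0).items():
        print(f"{component}: {ml:.3f} ml")
```

Scaling `total_ml` keeps the three surfactants in the same ratio, which corresponds to the "substantially the same proportions" alternative of the claim.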

31. A NAP enzyme identified by a method according to any preceding claim.

32. A NAP enzyme according to Claim 31, which has a greater thermostability than a
corresponding unselected enzyme.

33. A NAP enzyme according to Claim 31 or 32, which is a Taq polymerase having more
than 10 times increased half-life at 97.5 °C when compared to wild type Taq
polymerase.

34. A Taq polymerase mutant comprising the mutations: F73S, R205K, K219E, M236T,
E434D and A608V.
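
The mutation labels in the claims follow the usual convention of wild-type residue, 1-based position, replacement residue (so F73S means phenylalanine 73 replaced by serine). A minimal sketch of parsing such labels and applying them to a sequence is given below; the helper names and the three-residue test sequence are hypothetical, and the sketch does not contain or reproduce the Taq polymerase sequence.

```python
import re

# Wild-type residue, 1-based position, replacement residue, e.g. "F73S".
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Return (wild_type, position, replacement) for a label such as 'F73S'."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutations(sequence, labels):
    """Apply substitutions to a protein sequence, checking the wild-type residue."""
    residues = list(sequence)
    for label in labels:
        wild_type, position, replacement = parse_mutation(label)
        if position > len(residues) or residues[position - 1] != wild_type:
            raise ValueError(f"{label}: sequence does not have {wild_type} at {position}")
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    print(parse_mutation("F73S"))            # ('F', 73, 'S')
    print(apply_mutations("MKF", ["F3S"]))   # MKS (toy sequence)
```

Checking the wild-type residue before substituting guards against numbering offsets, which matter here because several of the listed clone sequences lack their first N-terminal residues.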

35. A NAP enzyme according to Claim 31, inhibited to a lesser extent by heparin
than is a corresponding unselected enzyme.

36. A NAP enzyme according to Claim 31 or 35, which is a Taq polymerase active at a
concentration of 0.083 units/µl or more of heparin.

37. A Taq polymerase mutant comprising the mutations: K225E, E388V, K540R, D578G,
N583S and M747R.
38. A NAP enzyme according to Claim 31 which is a replicase enzyme and is capable of
extending a primer having a 3' mismatch.

39. A NAP enzyme according to Claim 31 which is a replicase enzyme and is capable of
extending a primer having a 3' unnatural base (e.g. 5-nitroindole, or
3-carboxyamide-5-nitroindole).

40. A NAP enzyme according to Claim 31 which is a replicase enzyme and has an
enhanced capability to utilize α-thio dNTPs as nucleotide substrates.

41. A NAP enzyme according to Claim 31 which is a replicase enzyme and has an
enhanced capability to replicate substrates 2Skb in size in the absence of processivity
factors or a 3'-5' exonuclease proof-reading domain.

42. A replicase enzyme according to any one of claims 38 to 41, in which the 3' mismatch
is a 3' purine-purine mismatch or a 3' pyrimidine-pyrimidine mismatch.

43. A replicase enzyme according to any one of claims 38 to 42 in which the 3' mismatch
is an A-G mismatch or in which the 3' mismatch is a C-C mismatch.

44. A Taq polymerase mutant comprising the mutations: G84A, D144G, K314R, E520G,
check F598, A608V, E742G

45. A Taq polymerase mutant comprising the mutations: D58G, R74P, A109T, L245R,
R343G, G370D, E520G, N583S, E694K, A743P

46. A water-in-oil emulsion obtainable by emulsifying an aqueous phase with an oil phase
in the presence of a surfactant comprising 4.5% v/v Span80, 0.4% v/v Tween80 and
0.1% v/v TritonX100, or a surfactant comprising Span80, Tween80 and TritonX100 in
substantially the same proportions.
Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple (groups of)
inventions in this international application, as follows:

1. Claims: (1-7)-complete, (12-30)-partially

A method of selecting a nucleic acid processing (NAP)
enzyme, the method comprising the steps of: (a) providing a
pool of nucleic acids comprising members encoding a NAP
enzyme or a variant of the NAP enzyme; (b) subdividing the
pool of nucleic acids into compartments, such that each
compartment comprises a nucleic acid member of the pool
together with the NAP enzyme or variant encoded by the
nucleic acid member; (c) allowing nucleic acid processing to
occur; and (d) detecting processing of the nucleic acid
member by the NAP enzyme;

2. Claims: (8,9)-complete, (12-30)-partially

A method of selecting a pair of polypeptides capable of
stable interaction, the method comprising: (a) providing a
first nucleic acid encoding a first fusion protein
comprising a first subdomain of a NAP enzyme fused to a first
polypeptide, the second nucleic acid encoding a second
fusion protein comprising a second subdomain of a NAP enzyme
fused to a second polypeptide; in which stable interaction
of the first and second NAP subdomains generates processing
activity; and in which at least one of the first and second
nucleic acids is provided in the form of a pool of nucleic
acids encoding variants of the respective first and/or
second polypeptide(s); (b) subdividing the pools of nucleic
acids into compartments, such that each compartment
comprises a first nucleic acid and a second nucleic acid
together with respective fusion proteins encoded by the
first and second nucleic acids; (c) allowing the first
polypeptide to bind to the second polypeptide, such that
binding of the first and second polypeptides leads to stable
interaction of the NAP subdomains to generate NAP enzyme
activity; and (d) detecting processing of at least one of
the first and second nucleic acids by the NAP enzyme;



3. Claims: (10,11)-complete, (12-30)-partially

A method of selecting a polypeptide capable of stable
folding, the method comprising: (a) providing a pool of
nucleic acids comprising members encoding one or more
candidate polypeptides fused to a NAP enzyme; (b)
subdividing the pool of nucleic acids into compartments,
such that each compartment comprises a nucleic acid member
of the pool, the fusion polypeptide encoded by the nucleic
acid member, and the NAP enzyme; and allowing for folding of
the fusion polypeptide such that folded fusion polypeptide
will not inhibit NAP activity; and (c) detecting processing
of the nucleic acid member by the NAP enzyme;
4. Claims: 31-36, 38-42

A NAP enzyme; said enzyme which has a greater
thermostability than the corresponding unselected gene, said
enzyme which is a Taq polymerase having more than 10 times
increased half-life at 97.5°C when compared to wild type Taq
polymerase; said Taq polymerase mutant comprising the
mutations: F73S, R205K, K219E, M236T, E434D and A608V;



5. Claim: 37

Taq polymerase mutant comprising the mutations: K225E,
E388V, K540R, D578G, N583S and M747R;



6. Claim: 44

Taq polymerase mutant comprising the mutations: G84A, D144G,
K314R, E520G, check F598, A608V, E742G;



7. Claim: 45

Taq polymerase mutant comprising the mutations: D58G, R74P,
A109T, L245R, R343G, G370D, E520G, N583S, E694K and A743P;



8. Claim: 46

A water-in-oil emulsion obtainable by emulsifying an aqueous
phase with an oil phase in the presence of a surfactant
comprising 4.5% v/v Span80, 0.4% v/v Tween80 and 0.1% v/v
TritonX100, or a surfactant comprising Span80, Tween80 and
TritonX100 in substantially the same proportions;